Preface


My goal for this book is to provide a friendly concise introduction to algebra with empha- 
sis on its uses in the modern world - including a little history, concrete examples, and 
visualization. Beyond explaining the basics of the theory of groups, rings, and fields, I aim 
to give many answers to the question: “What is it good for?” The standard undergraduate
mathematics course in the 1960s (when I was an undergraduate) proceeded from Definition 
1.1.1 to Corollary 14.5.59 with little room for motivation, examples, history, and applica- 
tions. I plan to stay as far as possible from that old format, modeling my discussion on 
G. Strang’s book [115], where the preface begins: “I believe that the teaching of linear 
algebra has become too abstract.” My feeling is that the teaching of modern algebra (the 
non-linear part) has become even more abstract. I will attempt to follow Strang’s lead and 
treat modern algebra in a way that will make sense to a large variety of students. On the 
other hand, the goal is to deal with some abstractions - groups, rings, and such things. Yes, 
it is abstraction and generalization that underlies the power of mathematics. Thus there will 
be some conflict between the applied and pure aspects of our subject. 

The book is intended for a year-long undergraduate course in algebra. The intended 
audience is the less theoretically inclined undergraduates majoring in mathematics, the 
physical and social sciences, or engineering - including those in applied mathematics or 
those intending to get a teaching credential. 

The prerequisites are minimal: comfort with the real numbers, the complex numbers, 
matrices, vector spaces at the level of calculus courses - and a bit of courage when asked 
to do a short proof. 

In this age of computers, algebra may have replaced calculus (analysis) as the most 
important part of mathematics. For example: 


1. Error-correcting codes are built into the DVD player and the computer. Who do you call 
to correct errors? The algebraist, that’s who! 

2. Digital signal processing (such as that involved in medical scanners, weather prediction, 
the search for oil) is dominated by the fast Fourier transform or FFT. What is this? The 
FFT is a finite sum whose computation has been sped up considerably by an algebra 
trick which goes back to Gauss in 1805. Once more algebra, not analysis, rules. 

3. In chemistry and physics, one studies structures with symmetry such as the benzene
molecule (C₆H₆) depicted in Figure 0.1. What does the 6-fold symmetry have to do with
the properties of benzene? Group theory is the tool one needs for this.

4. The search for secret codes - cryptography. Much of the modern world - particularly 
that which lives on the internet - depends on these codes being secure. But are they? 
We will consider public-key codes. And who do you call to figure out these codes? An 
algebraist! 

5. The quest for beauty in art and nature. I would argue that symmetry groups are necessary
tools for this quest. See Figures 0.2-0.4. 


Figure 0.1 Benzene C₆H₆


Figure 0.2 Photoshopped flower 


Figure 0.3 Hibiscus in Kauai 


Our goal here is to figure out enough group and ring theory to understand many of 
these applications. And we should note that both algebra and analysis are necessary for 
the applications. In fact, we shall see some limits, derivatives, and integrals before the last 
pages of this book. You can skip all the applications if you just want to learn the basics
of modern algebra. The non-applied sections are all independent of those on applications.
However, you would be missing one of the big reasons that the subject is taught. And feel
free to skip any applied sections you want to skip - or to add any missing application that
you want to understand.


Figure 0.4 Picture with symmetry coming from the action of 2 × 2 matrices with nonzero
determinant and elements in a finite field with 11 elements

The first part of this book covers groups, after some preliminaries on sets, functions, rela- 
tions, the integers, and mathematical induction. Of course every calculus student is familiar 
with the group of real numbers under addition and similarly with the group of nonzero real 
numbers under multiplication. We will consider many more examples - favorites being 
finite groups such as the group of symmetries of an equilateral triangle. 

Much of our subject began with those favorite questions from high school algebra such 
as finding solutions to polynomial equations. It took methods of group theory to know 
when the solutions could not be found in terms only of nth roots. Galois, who died at age 
21 in a duel in 1832, laid the foundations to answer such questions by looking at groups of 
permutations of the roots of a polynomial. These are now called Galois groups. See Edna E. 
Kramer [59, Chapter 16] or Ian Stewart [114] for some of the story of Galois and the history 
of algebra. Another reference for stories about Galois and the many people involved in the 
creation of this subject is Men of Mathematics by Eric Temple Bell [8]. This book is often 
criticized for lack of accuracy, but it is more exciting than most. I found it inspirational as 
an undergraduate - despite the title. 

Another area that leads to our subject is number theory: the study of the ring of integers 


$\mathbb{Z} = \{0, \pm 1, \pm 2, \pm 3, \ldots\}$.


The origins of this subject go back farther than Euclid’s Elements. Euclid lived in Alexandria 
around 300 BC and his book covers more than the plane geometry we learned in high school.
Much of the basic theory of the integers which we cover in Chapter 1 is to be found in 
Euclid’s Elements. Why is it that the non-plane geometry part of Euclid’s Elements does not 
seem to be taught in high school? 

Polynomial equations with integer coefficients are often called Diophantine equations in
honor of Diophantus, who also lived in Alexandria, but much later (around AD 200). Yes,
algebra is an old subject and one studied in many different countries. For example, the
name “algebra” comes from the word al-jabr, one of the two operations used to solve quadratic
equations by the Persian mathematician and astronomer Mohamed ibn Musa al-Khwarizmi,
who lived around AD 800.

A large part of this subject was created during many attempts to prove Fermat’s last
theorem. This was a conjecture of Pierre de Fermat in 1637 stating that the equation
$x^n + y^n = z^n$ can have no integer solutions x, y, z with $xyz \neq 0$ and n > 2. Fermat claimed to
have a proof that did not fit in the margin of the book in which he wrote this conjecture.
People attempted to prove this theorem without success until A. Wiles, with the help of
R. Taylor, succeeded in 1995. People still seek an “elementary” proof.

Groups are sets with one operation satisfying the axioms to be listed in Section 2.1. 
After the basic definitions, we consider examples of small groups. We will visualize groups 
using Cayley graphs and various other diagrams such as Hasse or poset diagrams as well 
as cycle diagrams. Other topics of study include subgroups, cyclic groups, permutation 
groups, functions between groups preserving the group operations (homomorphisms), cosets 
of subgroups, building new groups whose elements are cosets of normal subgroups, direct 
products of groups, actions of groups on sets. We will consider such applications as public- 
key cryptography, the finite Fourier transform, and the chemistry of benzene. Favorite 
examples of groups include cyclic groups, permutation groups, symmetry groups of the 
regular polygons, and matrix groups such as the Heisenberg group of 3 × 3 upper triangular
matrices with real entries and 1 on the diagonal, the group operation being matrix
multiplication.

The second part of this book covers rings and fields. Rings have two operations satisfying 
the axioms listed in Section 5.2. We denote the two operations as addition + and multiplication
* or ·. The identity for addition is denoted 0. It is NOT assumed that multiplication
is commutative: that is, it is not assumed that ab= ba. If multiplication is commutative, 
then the ring is called commutative. A field F is a commutative ring with an identity 
for multiplication (called $1 \neq 0$) such that the nonzero elements of F form a multiplicative
group. Most of the rings considered here will be commutative. We will be particularly inter- 
ested in finite fields like the field of integers modulo p, Z/pZ where p is prime. You must 
already be friends with the field Q of rational numbers - fractions with integer numer- 
ator and nonzero integer denominator. And you know the field R of real numbers from 
calculus: that is, limits of Cauchy sequences of rationals. We are not supposed to say the 
word “limit” as this is algebra. So we will not talk about constructing the field of real 
numbers R. Ring theory topics include: definitions and basic properties of rings, fields, 
ideals, and functions between rings which preserve the ring operations (ring homomorphisms).
We will also build new rings (quotient rings) whose elements are cosets x + I of
an ideal I in ring R, for x in ring R. Note that here R is an arbitrary ring, not necessarily
the field of real numbers. We will look at rings of polynomials and their similarity to the 
ring of integers. We can do linear algebra for finite-dimensional vector spaces over arbi- 
trary fields in a similar way to the linear algebra that is included in calculus sequences. 
Our favorite rings are the ring Z of integers and the quotient ring Z/nZ of integers modulo
n, in which x is identified with all integers of the form x + nk, for integer k. Another
favorite is the ring of Hamilton quaternions which is isomorphic to four-dimensional space 
over the real numbers with basis 1, i, j, k and with multiplication defined by $ij = k = -ji$,
$i^2 = j^2 = k^2 = -1$.
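
The quaternion relations just stated can be checked mechanically. What follows is only a minimal sketch: it assumes a quaternion a + bi + cj + dk is stored as the 4-tuple (a, b, c, d), and the names `qmul` and `neg` are hypothetical helpers chosen for this illustration, not notation used elsewhere in the book.

```python
def qmul(p, q):
    """Multiply two quaternions stored as 4-tuples (a, b, c, d),
    meaning a + b*i + c*j + d*k (an assumed encoding)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # coefficient of 1
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # coefficient of i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # coefficient of j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # coefficient of k
    )


def neg(q):
    """Return -q."""
    return tuple(-x for x in q)


# The basis elements 1, i, j, k and the element -1.
ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)
MINUS_ONE = neg(ONE)

# ij = k, but ji = -k: multiplication is not commutative.
print(qmul(I, J) == K, qmul(J, I) == neg(K))
# i^2 = j^2 = k^2 = -1.
print(all(qmul(u, u) == MINUS_ONE for u in (I, J, K)))
```

The point of the check is the first line of output: the quaternions form a ring in which ab = ba can fail.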

Historically, much of our subject came out of number theory and the desire to prove
Fermat’s last theorem by knowing about factorization into irreducibles in rings like
$\mathbb{Z}[\sqrt{m}] = \{a + b\sqrt{m} \mid a, b \in \mathbb{Z}\}$, where m is a non-square integer. For example, it turns
out that, when m = −5, we have two different factorizations:


$2 \cdot 3 = (1 + \sqrt{-5}) \cdot (1 - \sqrt{-5})$.


So the fundamental theorem of arithmetic - true for Z as is shown in Section 1.5 - is false
for $\mathbb{Z}[\sqrt{-5}]$.
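
A short computation makes the two factorizations of 6 concrete. This is a sketch under stated assumptions: an element a + b√−5 is stored as the pair (a, b), and `mul` and `norm` are hypothetical helper names. The norm a² + 5b² is multiplicative, so a proper factor of 2, 3, or 1 ± √−5 (norms 4, 9, 6, 6) would need norm 2 or 3; the last lines check that no element has such a norm.

```python
def mul(u, v):
    """Multiply (a + b*sqrt(-5)) * (c + d*sqrt(-5)), elements stored as pairs."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)


def norm(u):
    """Norm of a + b*sqrt(-5), namely a^2 + 5*b^2."""
    a, b = u
    return a * a + 5 * b * b


first = mul((2, 0), (3, 0))    # 2 * 3
second = mul((1, 1), (1, -1))  # (1 + sqrt(-5)) * (1 - sqrt(-5))
print(first, second)           # both products are the element 6

# Any element with norm at most 3 has b = 0 and |a| <= 1,
# so this small search window covers every candidate.
small_norms = {norm((a, b)) for a in range(-3, 4) for b in range(-2, 3)}
print(2 in small_norms, 3 in small_norms)
```

Since neither 2 nor 3 occurs as a norm, none of the four factors splits further, which is why the two factorizations are genuinely different.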

Assuming that such factorizations were unique, Lamé thought that he had proved 
Fermat’s last theorem in 1847. Dedekind fixed up arithmetic in such rings by develop- 
ing the arithmetic of ideals, which are certain sets of elements of the ring to be considered 
in Section 5.4. One then had (at least in rings of integers in algebraic number fields) unique 
factorization of ideals as products of prime ideals, up to order. Of course, Lamé’s proof of 
Fermat’s last theorem was still invalid (lame - sorry for that). 

The favorite field of the average human mathematics student is the field of real numbers 
R. A favorite finite field for a computer is $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$, where p is prime. Of course you can
define Z/nZ, for any positive integer n, but you only get a ring and not a field if n is not 
a prime. We consider Z/nZ as a group under addition in Section 1.6. In Chapter 5 we view 
it as a ring with two operations, addition and multiplication. 
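
The claim that Z/nZ is a field only when n is prime can be explored by brute force for small n. The sketch below is illustrative only; `units_mod` and `is_field` are hypothetical names, and the search is deliberately naive.

```python
def units_mod(n):
    """Nonzero residues x mod n that have a multiplicative inverse mod n."""
    return [x for x in range(1, n)
            if any((x * y) % n == 1 for y in range(1, n))]


def is_field(n):
    """Z/nZ is a field exactly when every nonzero residue is invertible."""
    return n > 1 and len(units_mod(n)) == n - 1


print(units_mod(12))                               # invertible residues mod 12
print([n for n in range(2, 20) if is_field(n)])    # moduli giving a field
```

For n = 12 only 1, 5, 7, 11 are invertible, so Z/12Z is a ring but not a field; the moduli below 20 that do give a field are exactly the primes.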

Finite rings and fields were really invented by Gauss (1801) and earlier Euler (1750). 
Galois and Abel worked on field theory to figure out whether nth degree polynomial equations
are solvable with formulas involving only radicals $\sqrt[n]{a}$. In fact, finite fields are often
called “Galois fields.” 

The terminology of algebra was standardized by mathematicians such as Richard 
Dedekind and David Hilbert in the late 1800s. Much of abstract ring theory was devel- 
oped in the 1920s by Emmy Noether. Discrimination against both women and Jews made 
it hard for her to publish. The work became well known thanks to B. L. Van der Waerden’s 
two volumes [124] on modern algebra. Van der Waerden wrote these books after studying
with Emmy Noether in 1924 in Göttingen. He had also heard lectures of Emil Artin in
Hamburg earlier. See Edna E. Kramer [59] for more information on Noether and the other 
mathematicians who developed the view of algebra we are aiming to present. 

The abstract theory of algebras (which are special sorts of rings) was applied to group 
representations by Emmy Noether in 1929. This has had a big impact on the way peo- 
ple do harmonic analysis, number theory, and physics. In particular, certain adelic group 
representations are central to the Wiles proof of Fermat’s last theorem. 

It would perhaps shock many pure mathematics students to learn how much algebra is 
part of the modern world of applied mathematics - both for good and ill. Google’s motto: 
“Don't be evil,” has not always been the motto of those using algebra. Of course, the Google 
search engine itself is a triumph of modern linear algebra, as we shall see. 

We will consider many applications of rings in Chapter 8. Section 8.1 concerns random 
number generators from finite rings and fields. These are used in simulations of natural 
phenomena. In prehistoric times like the 1950s, sequences of random numbers came from
tables like that published by the RAND Corporation. Random numbers are intrinsic to Monte
Carlo methods. These methods were named for a casino in Monaco by J. von Neumann 
and S. Ulam in the 1940s while working on the atomic bomb. Monte Carlo methods are 
useful in computational physics and chemistry (e.g., modeling the behavior of galaxies, 
weather on earth), engineering (e.g., simulating the impact of pollution), biology (simulating 
the behavior of biological systems such as a cancer), statistics (hypothesis testing), game
design, finance (e.g., to value options, analyze derivatives - the very thing that led to the 
horrible recession/depression of 2008), numerical mathematics (e.g., numerical integration, 
stochastic optimization), and the gerrymandering of voting districts. 
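
To give a feel for how a finite ring can produce "random-looking" numbers, here is a sketch of one classical construction, the linear congruential generator x_{k+1} = (a·x_k + c) mod m. The parameters a = 3, c = 0, m = 31 are toy values assumed purely for illustration (far too small for real simulations), and nothing here is claimed to match the generators treated in Section 8.1.

```python
from itertools import islice


def lcg(seed, a=3, c=0, m=31):
    """Yield x_1, x_2, ... where x_{k+1} = (a * x_k + c) mod m and x_0 = seed."""
    x = seed % m
    while True:
        x = (a * x + c) % m
        yield x


first_five = list(islice(lcg(1), 5))
one_period = list(islice(lcg(1), 30))

print(first_five)
# Because 3 generates the multiplicative group of Z/31Z, the sequence
# visits every nonzero residue exactly once before it repeats.
print(sorted(one_period) == list(range(1, 31)))
```

The period here is 30 because the multiplier 3 has multiplicative order 30 modulo 31; a poor choice of multiplier gives a much shorter cycle.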

In Section 8.2 we will show how the finite field with two elements and vector spaces 
over this field lead to error-correcting codes. These codes are used in DVDs and in the 
transmission of information between a Mars spacecraft and NASA on the earth. Section 8.3 
concerns (among other things) the construction of Ramanujan graphs which can provide 
efficient communication networks. 

Section 8.4 gives applications of the eigenvalues of matrices to Googling. Section 8.5 
gives applications of elliptic curves over finite fields to cryptography. 

The rush to abstraction of twentieth-century mathematics has had some odd conse- 
quences. One of the results of the abstract ring theory approach was to create such an 
abstract version of Fourier analysis that few can figure out what is going on. A similar 
thing has happened in number theory. On the other hand, modern algebra has often made 
it easier to see the forest for the trees by simplifying computations, removing subsubscripts, 
doing calculations once in the general case rather than many times, once for each example. 

The height of abstraction was achieved in the algebra books of Nicolas Bourbaki (really 
a group of French mathematicians). I am using the Bourbaki notation for the fields of real 
numbers, complex numbers, rational numbers, and the ring of integers. But Bourbaki seems 
to have disliked pictures as well as applications. I do not remember seeing enough examples 
or history when I attempted to read Bourbaki’s Algebra as an undergrad. In an interview, one 
of the members of Bourbaki, Pierre Cartier (see Marjorie Senechal [102]) said: “The Bourbaki 
were Puritans, and Puritans are strongly opposed to pictorial representations of truths of 
their faith.” More information on Bourbaki as well as other fashions in mathematics can be 
found in Edna E. Kramer’s history book [59]. She also includes a brief history of women in 
mathematics as well as the artificial separation between pure and applied mathematics. 

As I said in my statement of goals, I will attempt to be as non-abstract as possible in this 
book and will seek to draw pictures in a subject where few pictures ever appear. I promise 
to give examples of every concept, but hope not to bury the reader in examples either, since 
I do aim for brevity. As I am a number theorist interested in matrix groups, there will be 
lots of numbers and matrices. Each chapter will have many exercises. It is important to do 
them - or as many of them as you can. Some exercises will be needed later in the book. The 
answers (mostly sketchy outlines) to odd-numbered exercises will be online hopefully. See 
my website. There may also be hints on others. No proof is intended to be very long. The 
computational problems might be slightly longer and sometimes impossible without the 
help of a computer. I will be using Mathematica, Scientific Workplace, and Group Explorer 
to help with computations. 


Suggestions for Further Reading 


A short list of possible references is: Garrett Birkhoff and Saunders Maclane [9], Larry L. 
Dornhoff and Franz E. Hohn [25], David S. Dummit and Richard M. Foote [28], Gertrude 
Ehrlich [29], John B. Fraleigh [32], Joseph A. Gallian [33], William J. Gilbert and W. Keith 
Nicholson [35], Israel N. Herstein [42], Audrey Terras [116]. There is also a free program: 
Group Explorer, which you can download and use to explore small groups. Another free but 
harder to use program is SAGE. I will be using Mathematica and Scientific Workplace. The 
Raspberry Pi computer ($35) comes with Mathematica (and not much else). There are many 
books online as well. One example is Judson [50]. It includes computer exercises using
SAGE. An online group theory book making use of the Group Explorer program is that of
Carter [12]. Wikipedia is often very useful - or just asking Google to answer a question. It 
is easier to be a student now than it was in my time - thanks to the multitude of resources 
to answer questions. On the other hand, it was nice just to have the one small book - in 
my case, Birkhoff and Maclane [9] - to deal with. And - perhaps needless to say - online 
sources can lie. Even the computer can lie - witness the arithmetic error in the Pentium 
chip that was revealed by number theorists’ computations in the 1990s. But I have found 
that Wikipedia is usually very useful in its discussions of undergraduate mathematics, as is 
the mathematical software I have used. 

It is often enlightening to look at more than one reference. Where something is mumbled 
about in one place, that same thing may be extremely clearly explained in another. Also 
feel free to read a book in a non-linear manner. If you are interested in a particular result 
or application, start there. 


1.1 Introduction


Notation. From now on, we will often use the abbreviations: 


=>              implies
<=              is implied by
iff (or <=>)    if and only if
∀               for every
∃               there exists
Z, Q, R, C      the integers, rationals, reals, complex numbers, respectively


We will not review the basics of proofs here. Hopefully you have figured out the basics, 
either from a high school plane geometry class or a college class introducing the subject 
of mathematical proof. See K. H. Rosen [93] for an introduction to proof. We will discuss 
proof by mathematical induction soon. There is an interesting book [60] by Steven Krantz 
on the subject of proof. Edna E. Kramer's history book [59] gives more perspective on the 
subject of proof. Another place to find a discussion of mathematical proof is Wikipedia. A 
cautionary tale concerns K. Gödel’s incompleteness theorems from 1931, the first of which
says that for any consistent formal system for the positive integers $\mathbb{Z}^+$, there is a statement
about $\mathbb{Z}^+$ that is unprovable within this system.

There are those who argue against proofs. I have heard this at conferences with physicists. 
Nature will tell us the truth of a statement, they argue. Ramanujan felt the goddess would
inspire him to write true formulas. However, I have no such help myself and really need 
to see a proof to know what is true and what is false. This makes me very bad at real life, 
where there is rarely a proof of any statement. Thus I have grown to be happier writing an 
algebra book than a book on politics. 

If you need more convincing about the need for proofs, look at the following two exercises,
once you know what a prime is: an integer p > 1 such that p = ab, with positive
integers a, b, implies either a = 1 or b = 1. These exercises are silly if you can use your computer
and Mathematica or some other similar program.


Exercise 1.1.1 Show that $x^2 - x + 41$ is prime for all integers x such that $0 \le x \le 40$, but is
not a prime when x = 41. Feel free to use a computer.
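
For readers without Mathematica, the computer check invited by Exercise 1.1.1 might look like the following Python sketch. The helper `is_prime` is a hypothetical trial-division test written for this illustration; it is adequate only for small numbers like these.

```python
def is_prime(n):
    """Trial division: True when n is a prime (n > 1 with no proper divisor)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(x):
    """The polynomial x^2 - x + 41 from Exercise 1.1.1."""
    return x * x - x + 41


all_prime = all(is_prime(f(x)) for x in range(0, 41))  # x = 0, 1, ..., 40
print(all_prime)
print(f(41), is_prime(f(41)))  # f(41) = 41 * 41, so it cannot be prime
```

A run like this verifies 41 cases and nothing more; it says nothing about why the pattern breaks at x = 41, which is visible at once from f(41) = 41² − 41 + 41.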


Number theory has multitudes of statements like that in Exercise 1.1.1 that have been 
checked for a huge number of cases, but yet fail to be true in all cases. Of course, now 
computers can do much more than the puny 41 cases in the preceding exercise. For example,
Mersenne primes are primes of the form $M_p = 2^p - 1$, where p is a prime. Mersenne
compiled a list of Mersenne primes in the 1600s, but there were some mistakes after p = 31.
Much computer time has been devoted to the search for these primes. Ever bigger ones
are found. In December 2017 the biggest known prime was found to be $M_{77232917}$. It is
conjectured that there are infinitely many Mersenne primes, but the proof has eluded 
mathematicians. See Wikipedia or Shanks [103] for more information on this subject and 
other unsolved problems in number theory. Wikipedia notes that these large primes have 
a cult following - moreover they have applications to random number generators and 
cryptography. 

In the 1800s - before any computers existed - there was a conjecture by E. C. Catalan
that $M_{M_p}$ is prime, assuming that $M_p$ is a Mersenne prime. Years passed before Catalan’s
conjecture was shown to be false. In 1953 the ILLIAC computer (after 100 hours of computing)
showed that $M_{M_p}$ is not prime when p = 13. $M_{M_p}$ is prime for p = 2, 3, 5, 7. It was
subsequently found that the conjecture is false for p = 17, 19, 31 as well. The next case is
too large to test at the moment. Wikipedia conjectures that the four known $M_{M_p}$ that are
prime are the only ones. Anyway, hopefully, you get the point that you can find a large
number of cases of some proposition that are true without the general proposition being
valid. Stark gives many more examples in the introduction to [110].


Exercise 1.1.2 (Mersenne Primes). Show that $2^p - 1$ is prime for p = 2, 3, 5, 7, 13, but not
for p = 11.


Hint. The Mathematica command below will do the problem for the first 10 primes. 


Table[{Prime[n], FactorInteger[2^Prime[n] - 1]}, {n, 1, 10}]
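
If Mathematica is not at hand, roughly the same table can be produced in Python. This is a sketch, not a substitute for a real factoring routine: `factor` is a hypothetical trial-division helper that returns (prime, exponent) pairs in the spirit of FactorInteger, and it is fast enough only because the numbers here are below 2^29.

```python
def factor(n):
    """Prime factorization of n >= 2 as a list of (prime, exponent) pairs."""
    out = []
    d = 2
    while d * d <= n:
        e = 0
        while n % d == 0:
            n //= d
            e += 1
        if e:
            out.append((d, e))
        d += 1
    if n > 1:
        out.append((n, 1))  # whatever remains is a prime factor
    return out


first_ten_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
table = {p: factor(2 ** p - 1) for p in first_ten_primes}

for p in first_ten_primes:
    print(p, table[p])

# 2^p - 1 is prime exactly when its factorization is a single prime to the power 1.
mersenne_exponents = [p for p in first_ten_primes if table[p] == [(2 ** p - 1, 1)]]
print(mersenne_exponents)
```

The row for p = 11 shows 2047 = 23 · 89, the first prime exponent for which 2^p − 1 fails to be prime.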


We assume that you can write down the converse of the statement “proposition A implies 
proposition B.” Yes, it is “proposition B implies proposition A.” Recall that A => B is not 
equivalent to B => A. However A => B is equivalent to its contrapositive: (not B) => 
(not A). 
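
Since A and B can each only be true or false, these claims about the converse and the contrapositive can be confirmed by listing all four cases. The sketch below does exactly that; `implies` is a hypothetical helper encoding the usual truth table for A => B.

```python
from itertools import product


def implies(a, b):
    """Truth value of 'a => b': false only when a is true and b is false."""
    return (not a) or b


rows = list(product([False, True], repeat=2))  # all four (A, B) combinations

# A => B agrees with (not B) => (not A) in every row.
contrapositive_ok = all(implies(a, b) == implies(not b, not a) for a, b in rows)

# Rows where A => B and its converse B => A disagree.
converse_differs = [(a, b) for a, b in rows if implies(a, b) != implies(b, a)]

print(contrapositive_ok)
print(converse_differs)
```

The converse disagrees with the original statement precisely when exactly one of A, B is true.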

We will sometimes use proof by contradiction. There are those who would object. In proof 
by contradiction of A => B we assume A and (not B) and deduce a contradiction of the 
form R and (not R). Those who would object to this and to any sort of “non-constructive” 
proof have a point, and so we will try to give constructive proofs when possible. See Krantz 
[60] for a bit of the history of constructive proofs in mathematics. 

It is also possible that you can prove something that may at first be unbelievable. See the 
exercise below, which really belongs to an analysis course covering the geometric series - 
the formula for which follows from Exercise 1.4.7 below. If you accept the axioms of the 
system of real numbers, then you have to believe the formula. 


Exercise 1.1.3 Show that 0.999999... =1. 
Hint. See Exercise 1.4.7. The ... conceals an infinite series. 
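
For orientation, here is how the hint plays out, written as a sketch that takes for granted the geometric series formula $\sum_{k \ge 0} r^k = 1/(1-r)$ for $|r| < 1$ (the formula the text says follows from Exercise 1.4.7):

```latex
0.999999\ldots
  \;=\; \sum_{k=1}^{\infty} \frac{9}{10^{k}}
  \;=\; \frac{9}{10} \sum_{k=0}^{\infty} \left(\frac{1}{10}\right)^{k}
  \;=\; \frac{9}{10} \cdot \frac{1}{1 - \frac{1}{10}}
  \;=\; \frac{9}{10} \cdot \frac{10}{9}
  \;=\; 1 .
```

The only analytic input is the convergence of the series; everything after the first equality is algebra.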

A controversial method of proof is proof by computer. First you have to believe that
the computer has been programmed correctly. This has not always been the case; e.g., the
problem with the Pentium chip. Here I will choose to believe what my computer tells me
when I use Mathematica to say whether an integer is a prime, or when used to compute
eigenvalues of matrices, or graphs of functions, or to multiply elements of finite fields. There 
are more elaborate computer proofs that are hard to verify without even faster computers 
than the cheap laptop (vintage 2011) that I am using - for example, the proof of the four 
color problem in the 1970s or the recent proof of the Kepler conjecture on the densest 
packing of spheres in 3-space. See Krantz [60] for more information. 

We are also going to assume that you view the following types of numbers as old friends: 


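the integers          $\mathbb{Z} = \{0, \pm 1, \pm 2, \pm 3, \ldots\}$,
the rationals         $\mathbb{Q} = \{m/n \mid m, n \in \mathbb{Z},\ n \neq 0\}$,
the reals             $\mathbb{R} = \{\text{all decimals}\}$,
the complex numbers   $\mathbb{C} = \{x + iy \mid x, y \in \mathbb{R}\}$, for $i = \sqrt{-1}$.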


We will list the axioms for Z in Chapter 1 and will construct Q from Z in Chapter 6. Of 
course, the construction of Q from Z just involves the algebra of fractions and could be 
done in Chapter 1 - minus the verbiage about fields and integral domains. We should define 
the real numbers as limits of Cauchy sequences of rationals rather than say that real numbers
are represented by all possible decimals, but that would be calculus and we won’t go there. 
Such a construction can be found for example in the book by Leon Cohen and Gertrude 
Ehrlich [17]. A serious student should really prove that Z, Q, R, and C exist by constructing 
them from scratch, sort of like a serious chef makes a pie, but we will not do that here. 

In contemplating the lower rows of our table of number systems, philosophers have 
found their hair standing on end. Around 500 BC the Pythagoreans were horribly shocked
to find that irrational numbers like $\sqrt{2}$ existed. You will be asked to prove that $\sqrt{2} \notin \mathbb{Q}$ in
Section 1.6. What was the problem for the Pythagoreans? You can read about it in Shanks
[103, Chapter III]. What would they have thought about transcendental numbers like $\pi$?
Later the complex numbers were so controversial that people called numbers like $i = \sqrt{-1}$
“imaginary.” Non-Euclidean geometry was so upsetting that Gauss did not publish his work 
on the subject. 


Warning. This course is like a language course. It is extremely important to memorize the 
vocabulary - the definitions. If you neglect to do this, after a week or so, the lectures - or 
the reading - will become meaningless. One confusing aspect of the vocabulary is the use of 
everyday words in a very different but precise way. Then one needs the axioms, the rules of 
constructing proofs. Those are our rules of grammar for the mathematical language. These 
too must be memorized. We should perhaps add that it is folly to argue with the definitions 
or axioms - unless you have found the equivalent of non-Euclidean geometry. To some our 
subject appears arcane. But they should remember that it is just a language — there is no 
mystery once you know the vocabulary and grammar rules. 

Practice doing proofs. This means practice speaking or writing the language. One can 
begin by imitating the proofs in the text or other texts or those given by your professor. It 
is important to practice writing proofs daily. In particular, one must do as many exercises 
as possible. If your calculus class did not include proofs, this may be something of a shock. 
Mathematics seemed to be just calculations in those sad proof-less classes. And we will 
have a few calculations too. But the main goal is to be able to derive “everything” from 
a few basic definitions and axioms - thus to understand the subject. One can do this for 
calculus too. That is advanced calculus. If you do not practice conversations in a language, 
you are extremely unlikely to become fluent. The same goes with mathematics. 

You should also be warned that sometimes when reading a proof you may doubt a statement
and then be tempted to stop reading. Sadly, often the next sentence explains why that
unbelievable statement is true. So always keep reading. This happens to “real” mathemati- 
cians all the time so do not feel bad. I have heard a story about a thesis advisor who told 
a student he did not understand the proof of a lemma in the student’s thesis. The student 
almost had a heart attack worrying about that important lemma. But it turned out that the 
advisor had not turned the page to find the rest of the proof. 

Our second goal is to apply the algebra we derive so carefully. We will not be able to 
go too deeply into any one application, but hopefully we will give the reader a taste of 
each one. 


1.2 Sets 


We first review a bit of set theory. Georg Cantor (1845-1918) developed the theory of infinite 
sets. It was controversial. There are paradoxes for those who throw caution to the winds 
and consider sets whose elements are sets. For example, consider Russell’s paradox. It was 
stated by B. Russell (1872-1970). We use the notation: $x \in S$ to mean that x is an element
of the set S; $x \notin S$ means x is not an element of the set S. The notation $\{x \mid x \text{ has property } P\}$
is read as the set of x such that x has property P. Consider the set X defined by


$X = \{\text{sets } S \mid S \notin S\}$.


Then $X \in X$ implies $X \notin X$ and $X \notin X$ implies $X \in X$. This is a paradox. The set X can neither
be a member of itself nor not a member of itself. There are similar paradoxes that sound 
less abstract. Consider the barber who must shave every man in town who does not shave 
himself. Does the barber shave himself? A mystery was written inspired by the paradox: The 
Library Paradox by Catharine Shaw. There is also a comic book about Russell, Logicomix 
by A. Doxiadis and C. Papadimitriou (see [26]). A nice reference for set theory illustrated 
by pictures and stories is the book by Vilenkin [122]. 

We will hopefully avoid paradoxes by restricting consideration to sets of numbers, vec- 
tors, and functions. This would not be enough for “constructionists” such as Errett Bishop 
who was on the faculty at the University of California San Diego until his premature death.
I am still haunted by his probing questions of colloquium speakers. Anyway, for applied 
mathematics, one can hope that paradoxical sets and barbers do not appear. Thus we will 
be using proof by contradiction, as we have already promised. 

Most books on calculus do a little set theory. We assume you are familiar with the notation 
which we are about to review. We will draw pictures in the plane. We write $A \subset B$ (or $B \supset A$)
if A is a subset of B: that is, $x \in A$ implies $x \in B$. We might also say B contains A. If $A \subset B$,
the complement of A in B is $B - A = \{x \in B \mid x \notin A\}$.¹ The empty set is denoted $\emptyset$. It has no
elements. The intersection of sets A and B is


$A \cap B = \{x \mid x \in A \text{ and } x \in B\}$.


The union of sets A and B is


$A \cup B = \{x \mid x \in A \text{ or } x \in B\}$.


Here - as is usual in mathematics - “or” means either or both. See Figure 1.1. Sets A and
B are said to be disjoint iff $A \cap B = \emptyset$.


¹ We will not use the other common notation B/A for set complement since it conflicts with our later
notation for quotient groups.


Figure 1.1 Intersection and union of square A and heart B. The intersection of A and B
is pink; the union of A and B is purple.


The easiest way to do the following exercises on the equality of various sets is to show 
first that the set on the left is contained in the set on the right and second that the set on 
the right is contained in the set on the left. 


Exercise 1.2.1

(a) Prove that
\[ A - (B \cup C) = (A - B) \cap (A - C). \]
(b) Prove that
\[ A - (B \cap C) = (A - B) \cup (A - C). \]


Exercise 1.2.2 Prove that $A \cup (B \cup C) = (A \cup B) \cup C$. Then prove the analogous equation
with $\cup$ replaced by $\cap$.


Exercise 1.2.3 Prove that $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.


Definition 1.2.1 If A and B are sets, the Cartesian product of A and B is the set of
ordered pairs (a, b) with $a \in A$ and $b \in B$: that is,

\[ A \times B = \{(a, b) \mid a \in A,\ b \in B\}. \]

It is understood that we have equality of two ordered pairs (a, b) = (c, d) iff a = c and
b = d.
Example 1. Suppose A and B are both equal to the set of all real numbers; $A = B = \mathbb{R}$. Then
$A \times B = \mathbb{R} \times \mathbb{R} = \mathbb{R}^2$. That is, the Cartesian product of the real line with itself is the set of
points in the plane. ▲


Example 2. Suppose C is the interval [0, 1] and D is the set {2} consisting of the single point 2.
Then $C \times D$ is the line segment of length 1 at height 2 in the plane. See Figure 1.2
below. ▲


Figure 1.2 Cartesian product $[0, 1] \times \{2\}$


Of course you can also define the Cartesian product of any number of sets - even an 
infinite number of sets. We mostly restrict ourselves to a finite number of sets here. Given 
n sets $S_i$, $i \in \{1, 2, \ldots, n\}$, define the Cartesian product $S_1 \times S_2 \times \cdots \times S_n$ to be the set of
ordered n-tuples $(s_1, s_2, \ldots, s_n)$ with $s_i \in S_i$, for all $i = 1, 2, \ldots, n$.
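For finite sets the Cartesian product can be listed explicitly. The following sketch uses itertools.product from the Python standard library; the sets A and B are arbitrary illustrations, and the finite set {0, 1} stands in for the interval [0, 1], so the triple product gives only the 8 corners of the unit cube.

```python
from itertools import product

A = [1, 2]
B = ["x", "y", "z"]

# A x B as a list of ordered pairs (a, b) with a in A and b in B.
pairs = list(product(A, B))

# {0, 1} x {0, 1} x {0, 1}: the corners of the unit cube.
corners = list(product([0, 1], repeat=3))

print(pairs)
print(len(pairs), len(corners))
```

Order matters in an ordered pair: (1, "x") belongs to A x B, while ("x", 1) belongs to B x A.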


Example 3. $[0, 1] \times [0, 1] \times [0, 1] = [0, 1]^3$ is the unit cube in 3-space. See Figure 1.3. ▲


Figure 1.3 $[0, 1]^3$


Example 4. $[0, 1] \times [0, 1] \times [0, 1] \times [0, 1] = [0, 1]^4$ is the four-dimensional cube or tesseract.
Draw it by “pulling out” the three-dimensional cube. See T. Banchoff [6]. Figure 1.4 below
shows the edges and vertices of the four-dimensional cube or tesseract (actually more of
a 4-rectangular solid) as drawn by Mathematica. Of course both Figures 1.3 and 1.4 are
really projections of the cube and hypercube onto the plane. ▲


Figure 1.4 Graph representing the hypercube $[0, 1]^4$


Exercise 1.2.4 Show that $A \times (B \cap C) = (A \times B) \cap (A \times C)$. Does the same equality hold
when you replace $\cap$ with $\cup$?


Exercise 1.2.5 State whether the following set-theoretic equalities are true or false and give 
reasons for your answers. 


(a) {2,5,7} ={5, 2,7}. 
(b) {(2,1), (2, 3)} ={(1, 2), (3, 2)}. 
(c) 0= {0}. 


Exercise 1.2.6 Prove the following set-theoretic identities: 


(a) $(A - C) \cap (B - C) = (A \cap B) - C$.
(b) $A \times (B - C) = (A \times B) - (A \times C)$.


1.3 The Integers 


Notation. 


$\mathbb{Z}^+ = \{1, 2, 3, 4, \ldots\}$   the positive integers
$\mathbb{Z} = \{0, \pm 1, \pm 2, \pm 3, \ldots\}$   the integers


We assume that you have been familiar with the basic facts about the integers since child- 
hood. Despite that familiarity, we must list the 10 basic axioms for Z in order to be able 
to prove anything about Z. By an axiom, we mean a basic unproved assumption. We must 
deduce everything we say about Z from our 10 axioms - forgetting what we know from 
elementary school. In Section 5.3 we will find that much of what we do here - especially in 
the pure algebra part (R1 to R6) - works for any integral domain and not just Z. Sometimes 
$\mathbb{Z}^+$ or $\mathbb{Z}^+ \cup \{0\}$ is referred to as the “natural numbers.” This seems somewhat prejudicial
to the other types of numbers one may use and so we will try to avoid that terminology. 
Algebra Axioms for Z

For every $n, m \in \mathbb{Z}$ there is a unique integer m + n and a unique integer $n \cdot m$ such that
the following laws are valid for all $m, n, k \in \mathbb{Z}$. This says the set of integers is closed under
addition and multiplication.

R1 Commutative laws: m + n = n + m and $m \cdot n = n \cdot m$.
R2 Associative laws: k + (m + n) = (k + m) + n and $k \cdot (m \cdot n) = (k \cdot m) \cdot n$.
R3 Identities: There are two special elements of Z, namely 0 (identity for addition) and
   1 (identity for multiplication), such that 0 + n = n, $1 \cdot n = n$, for all $n \in \mathbb{Z}$, and $0 \ne 1$.
R4 Inverse for addition: For every $m \in \mathbb{Z}$ there exists an element $x \in \mathbb{Z}$ such that
   m + x = 0. Write x = -m, once you know x is unique.
R5 Distributive law: $k \cdot (m + n) = k \cdot m + k \cdot n$.
R6 No zero divisors: $m \cdot n = 0$ implies either m or n is 0.


We sometimes write $n \cdot m = n * m = nm$. Thanks to the associative laws, we can leave out
parentheses in sums like k + m + n or in products like kmn. Of course we still need those
parentheses in the distributive law. 

As a result of axioms R1-R5, we say that Z is a “commutative ring with identity for 
multiplication.” As a result of the additional axiom R6 we say that Z is an “integral domain.” 
Rings will be the topic for the last half of this book - starting with Chapter 5. 


Exercise 1.3.1 


(a) Show that the identities 0 and 1 in R3 are unique. 
(b) Show that the inverse x of the element m in R4 is unique once m is fixed. 


Exercise 1.3.2 Show that $a \cdot 0 = 0$ for any $a \in \mathbb{Z}$.


Exercise 1.3.3 In axiom R4, we can write 1 + u = 0 and then define u = -1. Show that
then, for any $m \in \mathbb{Z}$, if x is the integer such that m + x = 0, we have $x = (-1) \cdot m$. Thus
$x = -m = (-1) \cdot m$. Prove that -(-m) = m.


Exercise 1.3.4 (Cancellation Laws). Show that if $a, b, c \in \mathbb{Z}$, then we have the following laws.

(a) If a + b = a + c, then b = c.
(b) If $a \ne 0$ and ab = ac, then b = c.

Exercise 1.3.5 Prove the other distributive law: $(m + n) \cdot k = m \cdot k + n \cdot k$.


Exercise 1.3.6 Prove that for any $a, b \in \mathbb{Z}$ we can solve the equation a + x = b for $x \in \mathbb{Z}$.


Additional axioms for Z involve the ordering < of Z which behaves well with respect 
to addition and multiplication. The properties of inequalities can be derived from three 
simple axioms for the set $P = \mathbb{Z}^+$ of positive integers.
Order Axioms for Z

O1 $\mathbb{Z} = P \cup \{0\} \cup (-P)$, where $-P = \{-x \mid x \in P\}$. Moreover this is a disjoint union.
   That is, $0 \notin P$, $0 \notin -P$, $P \cap (-P) = \emptyset$.
O2 $n, m \in P \implies n + m \in P$.
O3 $n, m \in P \implies n \cdot m \in P$.


As a result of the nine axioms R1-R6 and O1-O3, we say that Z is an ordered integral
domain. There is still one more axiom needed to define Z, but we discuss this below - after
saying more about the order relation a < b.


Definition 1.3.1 If $a, b \in \mathbb{Z}$ we say that a < b (b is greater than a or a is less than b) iff
$b - a \in P = \mathbb{Z}^+$. One can also write b > a in this situation.


Examples. By this definition, the set P consists of integers that are greater than 0. We can
see that 0 < 1 since otherwise, by O1, 0 < -1, that is, $-1 \in P$. But then, according to axiom O3 it follows
that $(-1)(-1) = 1 \in P$. This contradicts $P \cap (-P) = \emptyset$, since $-1 \in P$ also puts 1 in $-P$.

It follows from our axioms that $P = \mathbb{Z}^+ = \{1, 2, 3, 4, \ldots\}$, using O2 and the last axiom
(well-ordering) which we are about to state. This axiom will allow us to prove infinite lists
of statements by checking two items (mathematical induction). See Exercise 1.3.13. ▲


We can use our axioms to prove the following facts about order. 


Facts about Order. $\forall\, x, y, z, c \in \mathbb{Z}$


(1) Transitivity. x < y and y < z implies x < z.
(2) Trichotomy. For any $x, y \in \mathbb{Z}$ exactly one of the following inequalities is true:

\[ x < y, \quad y < x, \quad \text{or} \quad x = y. \]

(3) Addition. x < y implies x + z < y + z for any $z \in \mathbb{Z}$.
(4) Multiplication by a positive number. If 0 < c and x < y, then cx < cy.
(5) Multiplication by a negative number. If c < 0 and x < y, then cy < cx.


Proof. We will leave most of these proofs to the reader as an exercise. But we will do (1)
and (3).

Fact (1): x < y means $y - x \in P$. y < z means $z - y \in P$. Then by O2 and the axioms for
arithmetic in $\mathbb{Z}$, we have $(y - x) + (z - y) = z - x \in P$. This says x < z.

Fact (3): Since x < y we know that $y - x \in P$. Then $(y + z) - (x + z) = y - x \in P$ which is
what we needed to show. ▲


More Definitions. Of course we will write $a \le b$ if either a = b or a < b. We may also write
$b \ge a$ in this case.


Exercise 1.3.7 Prove the rest of the facts about order. 
Exercise 1.3.8 Show that $x^2 + 1 = 0$ has no solution $x \in \mathbb{Z}$ (or in any ordered integral domain
really). This says that if $i = \sqrt{-1}$, the domain of Gaussian integers $\mathbb{Z}[i] = \{x + iy \mid x, y \in \mathbb{Z}\}$
cannot be ordered - at least, not without dropping some properties of < on Z.


Exercise 1.3.9 By an ordered integral domain we mean a set satisfying all the axioms R1-R6 
and O1-O3.
(a) Show that there is no ordered integral domain D with 2 elements. 


(b) Show that there is no finite ordered integral domain. 


The set of real numbers R also satisfies axioms R1-R6 and O1-O3. In fact, it is what we
call an ordered field. We will consider fields in Section 5.3. 


Exercise 1.3.10 Define the absolute value |x| = x if $x \in \mathbb{R}$ and $x \ge 0$, and |x| = -x, if $x \in \mathbb{R}$
and x < 0.


(a) Show that |xy| = |x| |y| for all $x, y \in \mathbb{R}$.
(b) Prove the triangle inequality $|x + y| \le |x| + |y|$, for all $x, y \in \mathbb{R}$.


Hint. Use the fact that $|a| = \sqrt{a^2}$.


Exercise 1.3.11 Prove that the following inequality holds for all $a, b \in \mathbb{R}$ such that a and b
are both positive,

\[ \frac{a + b}{2} \ge \sqrt{ab}. \]


This is called the arithmetic-geometric mean inequality. The left-hand side is the arith- 
metic mean and the right-hand side is the geometric mean of a and b. The inequality can 
be generalized to an inequality involving n positive real numbers. We will consider that 
generalization in a later exercise. 


Because the creation of the real numbers involves limits - which reside within the domain 
of advanced calculus, we will not have much to say about this creation here. However we 
must still come to grips with the infinite, for example, infinite lists of theorems to prove. 
One thing that differentiates Z from the real numbers R or the rational numbers Q is the 
following well-ordering axiom. It really embodies the discreteness of Z, as opposed to the 
continuity of R. 


The Well-Ordering Axiom. If $S \subset \mathbb{Z}^+$, and $S \ne \emptyset$, then S has a least element $a \in S$ such that
$a \le x$, $\forall\, x \in S$.


This axiom says that any non-empty set of positive integers has a least element. We 
usually call such a least element a minimum. 

Note that you could state a similar axiom for $\mathbb{Z}^+ \cup \{0\}$ or $\mathbb{Z}^+ \cup \{0, -1\}$, or the union of
$\mathbb{Z}^+$ and any finite set of integers.

Now we have our 10 axioms for Z. We could do with fewer axioms. G. Peano (1858-1932) 
wrote down the five Peano postulates (or axioms) for the natural numbers $\mathbb{Z}^+ \cup \{0\}$. We
will not list the Peano postulates here. See, for example, Birkhoff and MacLane [9]. Once 
one has these axioms, it would be nice to show that something exists satisfying the axioms. 
We will not do that here - feeling pretty confident that you believe Z exists. See the comic
book about Russell, Logicomix [26] by A. Doxiadis and C. Papadimitriou for a story of the
writing of Principia Mathematica by Russell and Whitehead. One of the big events in the
book is reaching the point where one can deduce that 1 + 1 = 2.

It follows from all these axioms and the facts deduced from them that
$\mathbb{Z} = \{0, \pm 1, \pm 2, \pm 3, \pm 4, \ldots\}$. Thus we can picture the integers as forming a line of equally
spaced points stretching out to $\infty$ on the right and $-\infty$ on the left. See Figure 1.5, which
shows part of the discrete set of integers embedded in the continuous real line.


Figure 1.5 Integers on the line - only even ones are labelled


Now that we have listed all the axioms for Z, we should at least mention that one could 
also list the defining axioms for Q and R. Basically Q consists of fractions $\frac{m}{n}$, for $m, n \in \mathbb{Z}$,
with $n \ne 0$. The rules for adding and multiplying fractions are the usual ones. See Definition 6.4.1
if you forget. We will assume in many exercises that you know these things. Of
course Q has an ordering satisfying the same rules as that for Z. It is what we will later call
an ordered field. The real numbers R have an extra axiom that means that Cauchy sequences
converge. A Cauchy sequence $\{x_n\}$ of real numbers has the property that $|x_m - x_n| \to 0$,
as $m, n \to \infty$. One may view the real numbers as limits of Cauchy sequences of rationals.
For concreteness, view the real numbers as decimals. You can of course add and multiply
real numbers - also divide by nonzero ones. There is an ordering with the same rules as
that for Z. Most advanced calculus books discuss these things at length. Ordinary calculus
books are based on these rules. The well-ordering axiom is false for subsets of the positive
real numbers $\mathbb{R}^+$. There is no smallest element of the open interval (0, 1), for example.
Similarly $(0, 1) \cap \mathbb{Q}$ has no smallest element. We will say no more of this.

The following exercise may appear obvious from Figure 1.5, but we need to prove (using 
only our axioms) that the properties embodied in that figure are indeed true properties of Z. 


Exercise 1.3.12 Show that there is no largest integer N such that $\forall\, x \in \mathbb{Z}$, we have $x \le N$.


Exercise 1.3.13 
(a) Show that there is no integer a such that 0 <a< 1. 
(b) Deduce that then the set of positive integers
\[ P = \mathbb{Z}^+ = \{1, 2, 3, \ldots, n, n + 1, \ldots\}. \]
Thus, in particular, no sum of 1s can be 0.


Hint. 


(a) Consider the least such a and deduce a contradiction by considering the location of $a^2$
with respect to 0 and a.

(b) First explain why there is no integer a such that 1 < a < 2. Then explain why there is
no integer a such that n < a < n + 1, for any n = 2, 3, 4, ....


The most important fact about the well-ordering axiom is that it is equivalent to 
mathematical induction. We discuss that in the next section. 
Exercise 1.3.14 Show that if $a \in \mathbb{Z}$ and a < 0, then $a^2 > 0$.


1.4 Mathematical Induction 


Domino Version of Mathematical Induction. Given an infinite line of identical equally 
spaced dominos, we want to knock over all the dominos by just knocking over the first one 
in line. To be able to do this, we should make sure that the nth domino is so close to the 
(n+ 1)th domino (and so similar in weight) that when the nth domino falls over, it knocks 
over the (n + 1)th domino. See Figure 1.6. 


Figure 1.6 The first principle of mathematical induction. A penguin surveys an infinite line of 
equally spaced dominos. If the nth domino is close enough to knock over the (n + 1)th domino, then 
once the penguin knocks over the first domino, they should all fall over 


Translating this to theorems, we get the following. 


Principle of Mathematical Induction I
Suppose you want to prove an infinite list of theorems $T_n$, n = 1, 2, ... It suffices to do two
things.


Step 1. Prove $T_1$.
Step 2. Prove that $T_n$ true implies $T_{n+1}$ true for all $n \ge 1$.
Note that this works by the well-ordering axiom. If $S = \{n \in \mathbb{Z}^+ \mid T_n \text{ is false}\}$, then either
S is empty or S has a least element q. But we know q > 1 by the fact that we proved
$T_1$ - which was Step 1. And we know that $T_{q-1}$ is true since q is the least element of S.
But then, by Step 2, we know $T_{q-1}$ implies $T_q$, contradicting $q \in S$. It follows that S must
be empty - meaning that all our $T_n$ are true.

The assumption that $T_n$ is true in Step 2 is often called the induction hypothesis.


Example: Formula for the Sum of an Arithmetic Progression. $T_n$ is the formula used by
Gauss as a youth to confound his teacher: The teacher had asked the students to sum the
positive integers $\le 50$ or so.

\[ T_n: \quad 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}, \qquad n \ge 1. \]

To do this proof, we assume familiarity with the axioms for rational numbers. Sorry, we
do not discuss fractions until Section 6.4 but surely you know how to add and multiply
fractions already. If not, see Definition 6.4.1. ▲


Proof. We follow our procedure for Mathematical Induction I.
First, prove $T_1$: $1 = 1(2)/2$. Yes, that is certainly true by the rules for identifying
fractions.
Second, assume $T_n$ and use it to prove $T_{n+1}$, for n = 1, 2, 3, ...

\[ T_n: \quad 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}. \tag{1.1} \]

Add the next term in the sum, namely, n + 1, to both sides of the equation and obtain

\[ 1 + 2 + \cdots + n + (n + 1) = \frac{n(n + 1)}{2} + (n + 1). \]

Finish by simplifying the right-hand side of this last equality using our axioms for Z plus a
bit of knowledge of the distributive law for Q and the rule for adding fractions. You obtain

\[ \frac{n(n + 1)}{2} + (n + 1) = (n + 1)\left(\frac{n}{2} + 1\right) = (n + 1)\left(\frac{n + 2}{2}\right), \]

which gives us equation (1.1) with n replaced by n + 1; that is, formula $T_{n+1}$. This completes
the proof using Mathematical Induction I. ▲
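A computer cannot replace the induction proof, but it can check the two steps for as many values of n as we have patience for. The sketch below is only such a sanity check; the function names triangular_sum and closed_form are illustrative, and integer division by 2 is exact here because n(n + 1) is always even.

```python
def triangular_sum(n):
    """Add 1 + 2 + ... + n term by term."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def closed_form(n):
    """Right-hand side of formula (1.1): n(n + 1)/2."""
    return n * (n + 1) // 2


# Step 1: the base case T_1.
assert triangular_sum(1) == closed_form(1)

# Step 2, checked for small n: adding n + 1 to the formula for n
# gives the formula for n + 1.
for n in range(1, 200):
    assert closed_form(n) + (n + 1) == closed_form(n + 1)

print(triangular_sum(50), closed_form(50))
```

The loop verifies finitely many cases only; it is the induction principle that covers all n at once.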


One may find this proof a bit disappointing. Many students have complained at this point 
that they are not convinced of the truth of the formula for the sum of an arithmetic progres- 
sion. Induction does not seem to reveal the underlying reason for the truth of such a formula. 
Some have even complained that this proof requires that one believe in mathematical 
induction. Well, yes, it is equivalent to an axiom and we have to believe it. 

Of course, there are many other proofs of this sort of thing. For example, look at

\[ \begin{array}{ccccccc} 1 & + & 2 & + & \cdots & + & n \\ n & + & n - 1 & + & \cdots & + & 1 \end{array} \]

When you add terms in the same column you always get n + 1. There are n such columns.
Thus twice our sum is n(n + 1). However, this proof still does not reveal any insights as
to how to do the next exercise or to generalize the formulas to sums of arbitrary integer
powers $1^k + 2^k + \cdots + n^k$. We will not go into such methods here - as our only interest in
these formulas is to give us practice in the use of mathematical induction.


Exercise 1.4.1 Use mathematical induction to show that

\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}. \]

You will need to use the rules for adding fractions. See Definition 6.4.1.


Exercise 1.4.2 Consider the matrix $A = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}$, for $a \in \mathbb{R}$. Use mathematical induction to
show that $A^n = \begin{pmatrix} 1 & na \\ 0 & 1 \end{pmatrix}$ for all $n \in \mathbb{Z}^+$.


We also have the concept of inductive (or recursive) definition. For example, to define 
n factorial, written n!, we define 0! = 1 and, assuming that n! is defined, then we define
(n + 1)! = (n + 1) n!. Of course this means that $n! = n(n - 1)(n - 2) \cdots 2 \cdot 1$ is the product
of all integers between 1 and n. This number is the number of permutations or rearrangements
of a set of n objects. To see how many ways there are to arrange n elements in a row,
note that there are n choices for the first element in the row, n — 1 choices for the second, 
and so on until you reach the last element in the row of n elements, for which you have 
one choice. 
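An inductive definition translates almost word for word into a recursive function. The sketch below follows the definition of n! above and, for small n, compares it with a direct count of arrangements produced by itertools.permutations; the function names are illustrative.

```python
from itertools import permutations


def factorial(n):
    """Inductive definition: 0! = 1 and (n + 1)! = (n + 1) * n!."""
    if n == 0:
        return 1
    return n * factorial(n - 1)


def count_arrangements(n):
    """Count the rearrangements of n distinct objects by listing them all."""
    return sum(1 for _ in permutations(range(n)))


for n in range(6):
    print(n, factorial(n), count_arrangements(n))
```

Listing all arrangements becomes slow very quickly, which is one reason the formula is worth having.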


There are times when the first induction principle is not precisely what is needed. For 
that we need the second principle - also called strong induction or complete induction. 


Principle of Mathematical Induction II


To prove an infinite list of theorems

\[ d_1, d_2, d_3, \ldots, d_n, d_{n+1}, \ldots \]

you need to do two steps.


Step 1. Show that $d_1$ is true (or if necessary some finite number of $d_j$ are true).
Step 2. Show that for every $n \ge 1$, the truth of $d_1, d_2, d_3, \ldots, d_n$ implies the truth of $d_{n+1}$.


Exercise 1.4.3 Show that the well-ordering axiom implies the second principle of mathe- 
matical induction. 


The first and second principles of mathematical induction are both equivalent to the well- 
ordering principle. The fact that seemingly different principles are equivalent may appear 
to be surprising - welcome to the world of algebra. 


Mathematical Induction II in Pictures. Step 2 says you have to organize the dominos so that 
if we knock over all the first n dominos, then the next domino must fall. 

The moral is that, if your dominos are not all in a line, depending on the way the dominos 
are organized, it may take more than one domino to knock over the next one. In Figure 1.7 
you are supposed to need a variable number of dominos to knock over the ones to their 
left - and again all dominos are supposed to be the same size and weight. Perhaps we need 
some real dominos to make a better picture - or a better artist. 
In the examples, sometimes you know you need $d_{n-1}$ and $d_n$ to knock over $d_{n+1}$, but
other times you may not know how many $d_j$ with j < n + 1 you need to knock over $d_{n+1}$,
as in the proof of the fundamental theorem of arithmetic in the next section.


Figure 1.7 Here is an attempt to picture the second mathematical induction principle in which we
arrange dominos so that various numbers of dominos are needed to knock over the dominos to
their left. In this picture step 1 would be for the penguin to knock over $d_1$ and $d_2$


Example. Fibonacci numbers $f_n$ are defined inductively by setting

\[ f_1 = f_2 = 1 \quad \text{and} \quad f_{n+1} = f_{n-1} + f_n. \tag{1.2} \]

The first few Fibonacci numbers are

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765.

We could also have started with $f_0 = 0$, $f_1 = 1$. ▲
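The inductive definition (1.2) is easy to turn into a short loop. The following sketch starts from $f_0 = 0$, $f_1 = 1$ and keeps only the last two values at each step; the function name fibonacci is an illustrative choice.

```python
def fibonacci(count):
    """Return [f_1, ..., f_count], where f_1 = f_2 = 1 and f_{n+1} = f_{n-1} + f_n."""
    values = []
    previous, current = 0, 1  # f_0 = 0, f_1 = 1
    for _ in range(count):
        values.append(current)
        previous, current = current, previous + current
    return values


print(fibonacci(20))
```

Calling fibonacci(20) reproduces the twenty numbers listed above, ending with 6765.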


Exercise 1.4.4 Show that $f_n < 2^n$.


Hint. You will need the second mathematical induction principle. Use the results for n — 1 
and n to prove the result for n+ 1. 
History. The numbers $f_n$ in the preceding exercise are named for Fibonacci (1180-1228),
aka Leonardo of Pisa, who used these numbers to model the number of pairs of rabbits 
on an island supposing one pair of baby rabbits is left on the island at the beginning 
and assuming no rabbit dies - ever. Each newborn pair takes two months to mature and 
produces a new pair in the third month and in every month thereafter. Mathematicians in 
India may have considered these numbers before Fibonacci. These numbers are so popular 
that there is even a journal devoted to them. 

The Fibonacci numbers appear in many contexts - in pineapples, sunflowers, the family 
tree of a drone or male bee, the optics of light rays. For a discussion of such things, see 
the book on the golden ratio by Mario Livio [70]. We will look at the connection of the 
Fibonacci numbers and the golden ratio $\phi = \frac{1}{2}(1 + \sqrt{5})$ in Section 8.1.


Exercise 1.4.5 Use mathematical induction to prove that given n sets $B_1, \ldots, B_n$, and another
set A, we have the generalized distributive law:

\[ A \cap \left( \bigcup_{i=1}^{n} B_i \right) = \bigcup_{i=1}^{n} (A \cap B_i). \]

Hint. We are assuming Exercise 1.2.2 which says that the operation of union is associative 
as well as the extension to drop parentheses in unions of arbitrary numbers of sets. You will 
also need Exercise 1.2.3 in the case n=2. Then, of course, mathematical induction does 
the rest. 


Exercise 1.4.6 What is wrong with the following “proof” by induction? 

We claim we can show that in any room of n people all have the same birthday. 

This is clear when n= 1. 

To show that the case n - 1 implies the nth case, note that if a room has n people,
we can send person A out. Those left are n - 1 people, each having the same birthday by
the induction assumption. Now bring person A back and send person B out. Then, by the 
induction assumption, A must have the same birthday as the rest of the people in the room. 
So all have the same birthday! 


In the following exercises and throughout this text we use the summation symbol notation
$\sum_{k=1}^{n} a_k = a_1 + \cdots + a_n$ and we replace n by $\infty$ to mean that we have taken the limit
as $n \to \infty$.


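Exercise 1.4.7 (Geometric Progression and Series).


(a) Show using mathematical induction that if $x \in \mathbb{R}$ and $x \ne 1$, then
\[ \sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}. \]
(b) Then show that if |x| < 1, assuming you believe in limits,
\[ \sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}. \]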
Exercise 1.4.8 Use mathematical induction to prove that there are n! ways to rearrange a
row of n penguins. 


The following exercise implies the divergence of the harmonic series $\sum_{k=1}^{\infty} \frac{1}{k}$.


Exercise 1.4.9 Prove that for n= 1,2,3,... 


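Exercise 1.4.10 Prove that if $f_n$ denotes the nth Fibonacci number defined in formula (1.2),
with $f_0 = 0$, and $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$, then, with the usual matrix multiplication as defined in
formula (1.3) in Section 1.8 and the statement that follows it:

\[ \underbrace{A \cdot A \cdots A}_{n \text{ terms}} = A^n = \begin{pmatrix} f_{n+1} & f_n \\ f_n & f_{n-1} \end{pmatrix}, \quad \text{for } n \ge 1. \]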


1.5 Divisibility, Greatest Common Divisor, 
Primes, and Unique Factorization 


Number theory is an ancient subject which involves the study of various properties of 
the integers and generalizations thereof. One of the most fundamental properties is divis- 
ibility. This leads to the concept of prime number - a subject that is full of challenging 
questions that are easily stated but not easily answered. Here we present some of the 
basics of the subject. Some references for more information are: Ramanujachary Kuman- 
duri and Cristina Romero [62], Steven J. Miller and Ramin Takloo-Bighash [78], Kenneth 
Rosen [91], Daniel Shanks [103], Joseph H. Silverman [105], and Harold Stark [110]. One 
of the charms of number theory is that one can easily do experiments, especially now 
that computers are ubiquitous. However, there are many cautionary tales of false conjec- 
tures that hold true for large numbers of cases. We gave some examples in the exercises of 
Section 1.1. 


Definition 1.5.1 Suppose that a and b are integers. We say a divides b, written a|b, if
there is an integer c such that b = ac. We will also say that a is a divisor of b or b is a
multiple of a. We could also say b is divisible by a.


Examples. The set $\{\pm 1, \pm 2, \pm 3, \pm 4, \pm 6, \pm 12\}$ consists of all the divisors of 12. Every
integer divides 0. But 0 divides no integer other than 0. ▲


At this point, in order to find all the divisors of an integer n, we need to test all the 
integers m with $0 < m \le |n|$ to see if m divides n. We will have a better way soon.
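The brute-force test just described is a one-line loop. In this sketch the function name positive_divisors is an illustrative choice, and the remainder operator % is used to decide whether m divides n.

```python
def positive_divisors(n):
    """List the positive divisors of a nonzero integer n by testing every m with 0 < m <= |n|."""
    if n == 0:
        raise ValueError("every integer divides 0, so the list would be infinite")
    return [m for m in range(1, abs(n) + 1) if n % m == 0]


print(positive_divisors(12))
print(positive_divisors(-12))
```

The full set of divisors of n consists of these positive divisors together with their negatives.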


Exercise 1.5.1 Show that if both a|b and b|a, then $a = \pm b$.




Definition 1.5.2 An integer p > 1 is prime iff p = ab for integers a, b implies either a or
b is $\pm 1$.


Thus a prime p is a positive integer greater than 1 with no positive divisors but itself 
and 1. Note that 1 is not a prime by definition. Yes, it has no non-trivial divisors but it is 
special in a different way. It is its own multiplicative inverse and is thus called a unit in 
Z - a concept that will be important in ring theory. See Definition 5.2.3. 


Examples. The first few primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31. Mathematica has a
command that will tell you what is the nth prime. The size of n to plug into that command
depends on your computer. ▲
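For readers without Mathematica, here is a rough Python sketch of the same idea. It is not the Mathematica command; the names is_prime and nth_prime are illustrative, and the method is plain trial division, which is adequate only for small n. The loop stops at $d \cdot d \le p$ because in a factorization p = ab with 1 < a, b < p, one of the factors is at most $\sqrt{p}$.

```python
def is_prime(p):
    """Test Definition 1.5.2 by trial division: p > 1 with no divisor d, 1 < d < p."""
    if p <= 1:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True


def nth_prime(n):
    """Return the nth prime (n = 1 gives 2) by counting upward from 2."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if is_prime(candidate):
            count += 1
    return candidate


print([p for p in range(32) if is_prime(p)])
print(nth_prime(11))
```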


Exercise 1.5.2 Show that if $n \in \mathbb{Z}$ and n > 1, then n has a prime divisor.


Hint. Use the second induction principle. 


Exercise 1.5.3 Show that the preceding example of the primes $\le 31$ is correct. Do not use
a computer. 


Exercise 1.5.4 Show that there are infinitely many primes. 


Hint. Do a proof by contradiction and assume that there are only finitely many primes.
Call them $p_1, p_2, \ldots, p_n$. Consider the number $M = 1 + p_1 p_2 \cdots p_n$. Is M divisible by
any $p_j$?


Somewhere in the distant past you learned to do long division. So, for example, you
divide 4 into 31 and get quotient 7 and remainder 3. This means $31 = 4 \cdot 7 + 3$. This can
be tabulated as:

        7
    4 ) 31
        28
        --
         3

Note that if we define the floor of x to be $\lfloor x \rfloor$ = the largest integer $\le x$, then the quotient
here is $\lfloor 31/4 \rfloor = 7$ and the remainder is then $3 = 31 - 4 \lfloor 31/4 \rfloor$.
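In Python, the floor of a quotient of integers is available as the operator //, which rounds down and avoids floating-point arithmetic. The short sketch below recomputes the quotient and remainder from the floor formula; the function name quotient_and_remainder is an illustrative choice.

```python
def quotient_and_remainder(a, b):
    """Return (q, r) with q = floor(a / b) and r = a - b * q, for integers a and b > 0."""
    q = a // b      # floor division: the largest integer <= a/b
    r = a - b * q
    return q, r


print(quotient_and_remainder(31, 4))
print(divmod(31, 4))
```

The built-in divmod returns the same pair, so in practice one would simply call it.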


Exercise 1.5.5 Use the well-ordering principle to explain why $\lfloor x \rfloor$ exists.


Hint. Look at $\{n \in \mathbb{Z} \mid n \ge -x\}$.


Next we state a theorem whose proof justifies our knowledge that we can do long division. 
You might think us a bit crazy for proving this - especially if you have thought about the 
formula involving $\lfloor x \rfloor$ in the last paragraph and exercise. Later when we use a similar proof
to show that we can divide polynomials, you might forgive us - or not. 
Theorem 1.5.1 (The Division Algorithm). Suppose that a, b are non-negative integers with
b > 0. Then there are (unique) integers q (quotient) and r (remainder) such that a = bq + r
and $0 \le r < b$.


Proof. We use the well-ordering principle. Look at the set S of non-negative integers of
the form a - bn, for some $n \in \mathbb{Z}$. We know that S is not empty, since we can take n = 0,
as we are assuming that $a \ge 0$. So now let r be the smallest element of S (which exists by
the well-ordering axiom). We claim that r has the properties stated in the theorem. By the
definition of S, we know $0 \le r = a - bn$, for some $n \in \mathbb{Z}$, so we may take q = n. If (by contradiction) $r \ge b$, then
$r - b = a - bn - b = a - (n + 1)b \in S$, contradicting the minimality of r. We leave it as an
exercise to show that q and r are unique. ▲
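The proof can be read as a (slow) procedure: start with the element a of S and keep subtracting b as long as the result stays in S. The sketch below does exactly that, under the theorem's assumptions $a \ge 0$ and b > 0; the function name division_algorithm is an illustrative choice, and repeated subtraction is used only to mirror the proof, not because it is efficient.

```python
def division_algorithm(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < b, assuming a >= 0 and b > 0."""
    if a < 0 or b <= 0:
        raise ValueError("this sketch expects a >= 0 and b > 0")
    q, r = 0, a          # r = a - b*0 is an element of S
    while r >= b:        # then r - b is a smaller element of S
        q += 1
        r -= b
    return q, r          # r is now the least element of S


print(division_algorithm(31, 4))
print(division_algorithm(37, 5))
```

The loop stops after finitely many steps because r decreases by b > 0 each time and never drops below 0.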


Exercise 1.5.6 Prove the uniqueness of quotient and remainder in the division algorithm. 


Exercise 1.5.7 Write an alternative proof of the division algorithm using the floor function
to write $q = \lfloor a/b \rfloor$ and r = a - bq.


Exercise 1.5.8 Extend the division algorithm to negative integers a. For example:
$-5 = 2 \cdot (-3) + 1$. Here a = -5, b = 2, q = -3, r = 1.


Note that b divides a means that when we use the division algorithm to divide a by b, the 
remainder is 0. This terminology may seem somewhat confusing and so you might want to 
say b divides a evenly, but we will not. 


Definition 1.5.3 A positive integer d is called the greatest common divisor (gcd) of two 
integers a,b, where a and b are not both 0, written d= (a, b) = gcd(a, b), if the following 
two properties hold: 


(1) d divides both a and b, 
(2) if an integer c divides both a and b, then c must divide d. 


Definition 1.5.4 If gcd(a, b) = 1, we say that a and b are relatively prime. 


The existence of the greatest common divisor of two integers is not obvious until you 
have the Euclidean algorithm below. Once you have it, you can see that the greatest common 
divisor is just what the name says it is - the greatest of all common divisors of a and b. To 
see this, suppose that d' is the largest of all common divisors of a and b while d = gcd(a, b)
from Definition 1.5.3. Certainly then $d \le d'$. But, by part (2) of the definition of d, since d'
is a common divisor of a and b, we know that d' must divide d. This implies that $d' \le d$.
Thus d' = d.

For very small numbers it is easy to find gcd(a, b) by factoring a,b. For example if 
a is prime, then gcd(a, b) must be either a or 1. It is a only if a|b. However, we should 
be careful at this point because we have not yet proved the fundamental theorem of 
arithmetic - Theorem 1.5.3 - which tells us we can factor integers as products of primes
uniquely up to order. Thus when we say $\gcd(10, 25) = \gcd(2 \cdot 5, 5 \cdot 5) = 5$, we are assuming
the existence and uniqueness of the factorization. This would create what is called circular
reasoning since we want to use the existence of the greatest common divisor to prove
the fundamental theorem of arithmetic. Thus it is important that we devise a way to compute
the greatest common divisor that does not require unique factorization. That is the
Euclidean algorithm we are about to describe.


Examples. 1 = gcd(0, 1), 1 = gcd(3, 2), 5 = gcd(10, 25), 1 = gcd(37, 5). ▲


Exercise 1.5.9 Compute gcd(13, 169) and gcd(11, 1793).


The Euclidean Algorithm. To compute gcd(37, 5) using the Euclidean algorithm, one compiles
a list of divisions. The gcd will be the last nonzero remainder. We know that the list
of remainders must end in 0 since it is a strictly decreasing list of non-negative integers.

\[ \begin{aligned} 37 &= 5 \cdot 7 + 2 \\ 5 &= 2 \cdot 2 + 1 \\ 2 &= 1 \cdot 2 + 0 \end{aligned} \]

So gcd(37, 5) = 1.

It is not hard to check that the last nonzero remainder must divide all the preceding 
remainders and thus both 37 and 5. This is done by reading the preceding list from bottom 
to top. Moreover, if c is a common divisor of 37 and 5, we see that c must divide all the 
remainders by reading the list from top to bottom. 
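The list of divisions can be generated mechanically. The sketch below records each row as a tuple (dividend, divisor, quotient, remainder) and returns the last nonzero remainder; the function name euclid_steps is an illustrative choice, and positive inputs are assumed.

```python
def euclid_steps(a, b):
    """Return (rows, gcd) where each row (x, y, q, r) records the division x = y*q + r."""
    rows = []
    while b != 0:
        q, r = divmod(a, b)
        rows.append((a, b, q, r))
        a, b = b, r          # the divisor and remainder become the next pair
    return rows, a           # a now holds the last nonzero remainder


rows, d = euclid_steps(37, 5)
for x, y, q, r in rows:
    print(f"{x} = {y}*{q} + {r}")
print("gcd =", d)
```

For the inputs 37 and 5 the printed rows are the three divisions displayed above.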

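We can also use the Euclidean algorithm from bottom to top to write gcd(a, b) as an
integer linear combination of a and b:

\[ \begin{aligned} 1 &= 5 - 2 \cdot 2 && \text{from row 2} \\ 1 &= 5 - (37 - 5 \cdot 7) \cdot 2 && \text{from row 1.} \end{aligned} \]

Thus
\[ 1 = 5 \cdot (1 + 14) - 37 \cdot 2 = 5 \cdot 15 - 37 \cdot 2. \]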
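The back-substitution can also be automated. The recursive sketch below is one common way to organize it (often called the extended Euclidean algorithm); the function name extended_gcd is an illustrative choice, and non-negative inputs, not both 0, are assumed.

```python
def extended_gcd(a, b):
    """Return (d, m, n) with d == gcd(a, b) == m*a + n*b, for a, b >= 0 not both 0."""
    if b == 0:
        return a, 1, 0                 # gcd(a, 0) = a = 1*a + 0*0
    q, r = divmod(a, b)                # a = b*q + r
    d, m, n = extended_gcd(b, r)       # d = m*b + n*r
    return d, n, m - q * n             # substitute r = a - b*q


d, m, n = extended_gcd(37, 5)
print(d, m, n)
print(m * 37 + n * 5)
```

For 37 and 5 this produces the coefficients -2 and 15, matching the identity $1 = 5 \cdot 15 - 37 \cdot 2$ found by hand.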


Exercise 1.5.10 Compute d= gcd(17, 28) using the Euclidean algorithm. Then find integers 
m,n such that d= 17m + 28n. 


Exercise 1.5.11 Write out the general statement of the Euclidean algorithm for two positive 
integers a and b, and then prove it. How do we know that it ends after a finite number of 
steps? 


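Hint. Suppose that b < a. To find gcd(a, b) perform the following divisions:

\[ \begin{aligned}
a &= b q_1 + r_1, && \text{where } 0 \le r_1 < b \\
b &= r_1 q_2 + r_2, && \text{where } 0 \le r_2 < r_1 \\
r_1 &= r_2 q_3 + r_3, && \text{where } 0 \le r_3 < r_2 \\
&\ \ \vdots \\
r_{n-2} &= r_{n-1} q_n + r_n, && \text{where } 0 \le r_n < r_{n-1} \\
r_{n-1} &= r_n q_{n+1} + 0, && \text{so that } r_n = \text{last nonzero remainder.}
\end{aligned} \]

You need to show that $r_n = \gcd(a, b)$. It is understood that if $r_1 = 0$, then gcd(a, b) = b.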

The Euclidean algorithm leads to an identity named after the French mathematician
E. Bézout, who was not really the first to prove it for the integers. Instead, he discovered 
the result for polynomials which we will describe in Section 5.5. 


Theorem 1.5.2 (Bézout’s Identity) If a and b are integers, then d= gcd(a, b) =na + mb 
for some integers m,n. Moreover, d is the smallest positive integer which is an integer 
linear combination, na + mb, of a and b. 


Proof. We could prove this theorem constructively by reading the Euclidean algorithm 
backwards. But let us instead give a non-constructive proof which does not mention the 
Euclidean algorithm. 

We might as well assume a, b are both positive. Look at the set of integers 


S = {k > 0 | k = na + mb, n, m ∈ Z}.


Note that S is not empty and thus has a least element q by the well-ordering principle. 
In fact, q is the gcd(a, b). To see this, note that since q is an element of S, it is clear that 
any common divisor of a and b must divide q. To see that q divides a, use the division 
algorithm to write a = qc + r, with 0 ≤ r < q. But then, if r > 0, r = a − qc is in S. Thus r must be 0
by the minimality of q. So q|a. Similarly q|b. Thus q = gcd(a, b). ▲


Exercise 1.5.12 Fill in the details in the proof of the preceding theorem. For example, explain
why S is non-empty and r ∈ S.


Exercise 1.5.13 Use the Euclidean algorithm to compute gcd(163, 1001) and gcd(163, 1141). 


Lemma 1.5.1 (Euclid’s Lemma) Suppose that a and b are integers and let p be a prime. If p 
divides ab then either p divides a or p divides b. 


Proof. Suppose that p does not divide a. Then 1 = gcd(a, p). Why? It follows from Theorem
1.5.2 that 1 = na + mp for some integers n, m. Multiply this equality by b. That gives b =
nab + mbp. Since, by hypothesis, ab = pc, for some integer c, this means b = p(nc + mb)
and p divides b. ▲


We will need part (a) of the following exercise (proved by induction) in the proof of the 
next theorem. 


Exercise 1.5.14 


(a) Prove that if prime p divides a product a_1 a_2 ··· a_r then p must divide a_j for some j.
(b) Prove that if p is a prime and p does not divide the integer n, then gcd(p, n) = 1. 


Theorem 1.5.3 (The Fundamental Theorem of Arithmetic). Every positive integer n>1 
factors uniquely (up to order) as a product of primes. 

Proof. Here we sketch only the uniqueness and leave the existence of the factorization as
an exercise for the reader. Both parts of the proof require the second principle of induction.
Step 1. We start with n = 2. In this case, the factorization is unique as 2 is a prime.
Step 2. Our induction assumption says: each integer m with 2 ≤ m ≤ n has a unique prime
factorization. We must use this to show that then so does n + 1.

So suppose that n + 1 has two prime factorizations 


n + 1 = p_1 p_2 ··· p_u = q_1 q_2 ··· q_v.


Here all the p_i and q_j are primes (not necessarily distinct). By part (a) of Exercise 1.5.14, if a
prime divides a product, then it must divide one of the factors. So we know that the prime
p_1 must divide the prime q_j for some j. But since the only positive divisors of the prime
q_j are 1 and q_j, it follows that p_1 = q_j. This means we can divide p_1 = q_j out of both sides
and obtain two factorizations for the smaller number (n + 1)/p_1. If the two factorizations
of n + 1 were different, these would be different as well. But that is not
possible by the induction assumption: that is, all the remaining primes must also coincide,
and n + 1 has a unique factorization. ▲


Exercise 1.5.15 Complete the proof of the preceding theorem by proving the existence of the 
factorization. 


Euclid proved the existence part of the fundamental theorem. The uniqueness was not 
proved until Carl Friedrich Gauss (1777-1855) wrote his book Disquisitiones Arithmeticae 
in 1798 when he was 21. The uniqueness was perhaps considered obvious. However, when 
people attempted to prove Fermat’s last theorem using arithmetic in more general rings 
like Z[e^{2πi/n}] whose elements are polynomials with integer coefficients in the nth root of
unity ζ_n = e^{2πi/n}, it was soon learned that unique factorization can and does fail for large
enough values of n: for example, n = 23. We call ζ_n = e^{2πi/n} an nth root of unity because it
is a root of the polynomial x^n − 1.


Exercise 1.5.16 State whether the following statements are true or false and give a brief 
explanation of your answer. 


(1) If a, b, r ∈ Z^+ and r divides ab, then either r divides a or r divides b.
(2) For a, b ∈ Z^+, if a divides b and b divides a, then a = b.


Exercise 1.5.17 Find all the positive divisors of 24,36, and 81. 


Exercise 1.5.18 Suppose the prime factorization of the integer n ≥ 2 is n = p_1^{e_1} ··· p_k^{e_k}, with
exponents e_i > 0 and pairwise distinct primes p_i. Write down an expression for any positive
divisor of n. Then give a formula for the number of positive divisors of n.


The following exercise would upset any Pythagoreans - should there be any existing 
today. The Pythagoreans formed a secret society around 550 BC. They believed: “Natural
numbers and their ratios rule the universe.” See Edna E. Kramer [59, Chapter 2] for more
information on them. Thus the Pythagoreans were OK with fractions but they drew the line
at the next step - creating √2 - even though they loved the right triangle with sides of
length 1 and hypotenuse of length √2 - using the Pythagorean theorem. Supposedly they
would kill you for revealing the irrationality of √2 to someone. Mercifully they disappeared
thousands of years ago. 


Exercise 1.5.19 Prove that √2 is irrational: that is, not of the form m/n, where m, n ∈ Z, with
n ≠ 0 and gcd(m, n) = 1. The object is to derive a contradiction to the assumption that m
and n are relatively prime using the equation √2 = m/n. You need to square both sides and
note that squares of odd numbers are odd while squares of even numbers are even. Once
you see how to do √2, go on to prove that the kth root of 2, 2^{1/k}, is also irrational, for all k = 2, 3, 4, 5, 6, ...


Exercise 1.5.20 Factor the following numbers. Feel free to use Mathematica or your favorite 
computer program. 


(a) 31415 
(b) 314159265 
(c) 314159265358979


In the 1970s I went to number theory conferences that held factoring competitions 
using programmable calculators. So I had to include the last problem. Now many types 
of computer software will do this factorization. Mathematica works for me. I admit that 
I did not take part in these calculator races of the 1970s. Very clever methods must be 
used to factor large numbers - especially if they are products of two large primes. The 
principle that underlies some cryptographic systems is the difficulty of factorization. See 
Section 4.1. 

Figure 1.8 is an ArrayPlot in Mathematica of the matrix of values gcd(m, n) for −50 ≤
m, n ≤ 50. A different color is associated to each value with the color range pretty evident
along the main diagonal. 


Figure 1.8 A color is placed at the (m,n) entry of a 101 x 101 matrix according to the value of 
gcd(m, n). This is an ArrayPlot in Mathematica 


Exercise 1.5.21 Explain the lines seen in Figure 1.8. The command in Mathematica used to
generate the figure was:


ArrayPlot[Table[GCD[m, n], {m, -50, 50}, {n, -50, 50}],
  ColorFunction -> ColorData["BrightBands"],
  Frame -> False, DataReversed -> {True, False}]


Hint. The diagonal m=n clearly comes from gcd(m,m) =|m|. The color scheme chosen 
runs from red through blue, green, yellow, to orange. What about the other lines? For 
example, what happens to gcd(m,n) if m=2n? When m is a rational multiple of n, say 
m= an, what can you say about gcd(m, n)? 


Exercise 1.5.22 Suppose that a, b, c ∈ Z^+ are such that gcd(a, b) = 1 and ab = c^n, for some
n > 1. Show that then there are r, s ∈ Z^+ such that a = r^n and b = s^n.


In the following exercise we use the notation

\prod_{i=1}^{k} a_i = a_1 a_2 a_3 \cdots a_k.


Exercise 1.5.23 Show that if the p_i are pairwise distinct primes, for i = 1, ..., k, then


\gcd\left( \prod_{i=1}^{k} p_i^{e_i}, \ \prod_{i=1}^{k} p_i^{f_i} \right) = \prod_{i=1}^{k} p_i^{g_i}, where g_i = min{e_i, f_i}.


1.6 Modular Arithmetic, Congruences 

The standard way to begin the story of modular arithmetic - a story which goes back to 
the 1700s - is to think about clocks. 

Example: Clock Arithmetic 

Question. If it is 3 o’clock now, what time is it after 163 hours? 


Method of solution. Divide 163 by 12. Obtain the quotient 13 and the remainder 7. That
is, 163 = 13 × 12 + 7.


Answer. It will be 10=(7 + 3) o’clock. 


What if you want to know if it is a.m. or p.m.? Then you should divide by 24 rather 
than 12. 
We say that 3 + 163 is congruent to 10 modulo 12 and write 


3 + 163 ≡ 3 + 7 ≡ 10 (mod 12). ▲
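

The clock computation is a one-liner in most programming languages. Here is a small Python sketch; the function name clock_after and its parameters are ours, not part of any library.

```python
def clock_after(start_hour, elapsed_hours, modulus=12):
    """Hour shown on a clock with `modulus` hours after `elapsed_hours`."""
    hour = (start_hour + elapsed_hours) % modulus
    # A clock face shows the modulus itself (12) rather than 0.
    return modulus if hour == 0 else hour


quotient, remainder = divmod(163, 12)
print(quotient, remainder)      # 13 7
print(clock_after(3, 163))      # 10
```

Only the remainder 7 matters for the answer; the quotient 13 counts the full turns of the hour hand.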


C. F. Gauss invented this notation. L. Euler (1707-1783) had introduced the idea earlier 
(around 1750). It gives us a new way to do arithmetic. This is modular arithmetic and is 
the foundation of many of our applications. 


Definition 1.6.1 Fix a modulus m which is a positive integer. If a, b are integers, define
a ≡ b (mod m) (read as “a is congruent to b modulo m”) iff m divides (a − b). The
relation a ≡ b (mod m) is called a congruence. We will say that a and b are in the same
congruence class [a] mod m.


When considering integers a,b (mod m), we want to identify a and b when m divides 
(a — b). That is, we are going to glomp together all integers that are congruent to a mod m 
and call the result [a] a congruence class. This is the first example of a construction that 
we will see quite often in the rest of this book. See Definition 3.4.1 for example. 


Exercise 1.6.1 Suppose it is now 7 p.m. What time will it be after 101 hours? Is it a.m. 
or p.m.? 


Exercise 1.6.2 Suppose it is 5 p.m. at the airport (Lindbergh field) in San Diego now. What 
time is it when you get off your plane at Kennedy airport after you take a 5 hour flight to 
New York? 


Exercise 1.6.3 Prove that a ≡ b (mod m) iff a and b have the same remainder upon division
by m.


Example. Let m=3. When we create the integers mod 3, we are taking the infinite line of 
integers and rolling it up into a triangle. A picture of this is given in Figure 1.9. 


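[Figure 1.9: the integers rolled up into a triangle, with vertices labeled
1 ≡ −5 ≡ −2 ≡ 4 ≡ 7 (mod 3), 2 ≡ −4 ≡ −1 ≡ 5 ≡ 8 (mod 3), and 0 ≡ −3 ≡ 3 ≡ 6 ≡ 9 (mod 3).]


Figure 1.9 Rolling up the integers modulo 3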


There are three congruence classes of integers that we have glomped together mod 3:
namely [0], [1], [2]. Here [a] = {x ∈ Z | x ≡ a (mod 3)}. The set of these congruence classes is
denoted Z_3. We will normally identify Z_3 with {0, 1, 2} or - equivalently - with {−1, 0, 1}.
We can then use ordinary addition and multiplication of integers to define a sum and
product on Z_3.

Taking the modulus m = 12, you get a clock. Take the modulus m = 1, and you get one
number; that is, Z_1 can be identified with {0}. ▲


Example: Addition and Multiplication mod 3. We add mod 3 by adding representatives of
the congruence classes: [a] + [b] = [a + b]. Thus


0 + 1 ≡ 1 (mod 3),
1 + 2 = 3 ≡ 0 (mod 3),
2 + 2 = 4 ≡ 1 (mod 3).

Multiplication is defined similarly: [a] · [b] = [a · b]. This gives, for example:


2 · 2 = 4 ≡ 1 (mod 3). ▲


With these definitions, you get addition and multiplication tables for Z_3, which are also
called Cayley tables, named for Arthur Cayley (1821-1895). Such tables first appeared in
an 1854 paper of Cayley. In the first table, the top row and left column give the elements
of Z_3. Then the entry at row a, column b, for a, b ∈ Z_3, is a + b (mod 3). The second table
is analogous with + replaced by ×.


Addition table for Z_3

+ (mod 3) | 0 | 1 | 2
----------+---+---+----------
    0     | 0 | 1 | 2
    1     | 1 | 2 | 0 ≡ 1 + 2
    2     | 2 | 0 | 1 ≡ 2 + 2


Multiplication table for Z_3

× (mod 3) | 0 | 1 | 2
----------+---+---+----------
    0     | 0 | 0 | 0
    1     | 0 | 1 | 2
    2     | 0 | 2 | 1 ≡ 2 · 2


In the multiplication table, it would make sense to leave out the first row and first column
since they only contain 0s. Later - in Chapter 5 - we will learn that Z_3 is a ring, in fact, a
field. But before Chapter 5, we will just say Z_3 is a group under addition, while Z_3 − {0}
is a group under multiplication.
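
Cayley tables are easy to generate by machine. The Python sketch below builds both tables for the integers mod 3, identified with {0, 1, 2}; the helper name cayley_table is our own.

```python
def cayley_table(n, op):
    """Table of op(a, b) reduced mod n, for a, b in {0, 1, ..., n-1}.
    Row a, column b holds op(a, b) mod n."""
    return [[op(a, b) % n for b in range(n)] for a in range(n)]


add_table = cayley_table(3, lambda a, b: a + b)
mul_table = cayley_table(3, lambda a, b: a * b)

for row in add_table:
    print(row)
# [0, 1, 2]
# [1, 2, 0]
# [2, 0, 1]

for row in mul_table:
    print(row)
# [0, 0, 0]
# [0, 1, 2]
# [0, 2, 1]
```

Changing the first argument gives the tables for any other modulus.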

More generally, we can define the integers modulo n, Z_n, to consist of the elements
of Z identified under congruence mod n. Thus the elements are congruence classes
[a] = {a + nm | m ∈ Z}. And we define [a] + [b] = [a + b], [a][b] = [ab]. One needs to check
that these definitions give well-defined operations. This means that the sum or product
is unambiguous. For example, in Z_3, [1] + [2] = [3] = [4] + [5] = [9] = [0]. Yes, both sums
are [0]. See the exercises below or Section 2.3 for more on this subject. With this notation,
Z_3 = {[0], [1], [2]}. Usually we leave out the [ ].
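
Before proving that an operation on congruence classes is well defined, one can test it numerically on a finite window of representatives. The Python sketch below does this; the name is_well_defined and the window −20, ..., 20 are our own choices, and a finite check like this is evidence, not a proof.

```python
def is_well_defined(n, op, window=range(-20, 21)):
    """Check, for representatives a, b in `window`, that the class of
    op(a, b) mod n depends only on the classes of a and b mod n."""
    seen = {}
    for a in window:
        for b in window:
            key = (a % n, b % n)          # the pair of classes ([a], [b])
            value = op(a, b) % n          # the class of the result
            if seen.setdefault(key, value) != value:
                return False              # two representatives disagreed
    return True


print(is_well_defined(3, lambda a, b: a + b))        # True
print(is_well_defined(3, lambda a, b: a * b))        # True
print(is_well_defined(3, lambda a, b: a // 2 + b))   # False
```

The last operation fails because 0 and 3 lie in the same class mod 3, yet 0 // 2 = 0 and 3 // 2 = 1 lie in different classes.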


Exercise 1.6.4 Create the analogous addition and multiplication tables when m = 7 and 8.


We will have much more to say about these congruence groups. They are important
for most of the applications we will discuss: for example, error-correcting codes and
cryptography.


Exercise 1.6.5

(a) For n ∈ Z^+ and a, b, c ∈ Z, show that a ≡ b (mod n) implies a + c ≡ b + c (mod n).

(b) If a, b, c and n are as in (a) and d ∈ Z, show that a ≡ b (mod n) and c ≡ d (mod n)
imply that a + c ≡ b + d (mod n). This means that addition in Z_n is well defined (i.e.,
unambiguous).


Exercise 1.6.6

(a) For n ∈ Z^+ and a, b, c ∈ Z, show that a ≡ b (mod n) implies ac ≡ bc (mod n).

(b) If a, b, c and n are as in (a) and d ∈ Z, show that a ≡ b (mod n) and c ≡ d (mod n)
imply that ac ≡ bd (mod n). This means that multiplication in Z_n is well defined (i.e.,
unambiguous).


Exercise 1.6.7 Compute 51°°°°” (mod 4). Then compute 31°” (mod 4). You should get an 
element of the set {0, 1,2,3}. Note that you should not compute 5'°°°° or 3'°°°°7 which 
are humongous numbers. 


Exercise 1.6.8 


(a) Compute gcd(83, 38) =d using the Euclidean algorithm from the preceding section. 
(b) Use the result of part (a) to write d= 83m + 38n, with integers m, n. 
(c) Then use part (b) to solve 38x ≡ 1 (mod 83).


The following exercise is motivated by the preceding one and will be used in Section 2.5.


Exercise 1.6.9 (Solving Linear Congruences). Suppose n ∈ Z^+ and a, b ∈ Z. Consider the
linear congruence

ax ≡ b (mod n).


The question is: when can this congruence be solved for x (mod n)? Prove that the answer 
is: when d= gcd(a, n) divides b. 


Hint. Use the Bézout identity (Theorem 1.5.2). 
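

The criterion in Exercise 1.6.9 can be explored numerically before it is proved. The Python sketch below lists all solutions x in {0, ..., n − 1}; the function name solve_linear_congruence is ours, and the code assumes Python 3.8 or later, where the built-in pow(a, -1, n) returns an inverse of a mod n.

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return all x in {0, ..., n-1} with a*x ≡ b (mod n), in increasing order."""
    d = gcd(a, n)
    if b % d != 0:
        return []                        # no solution unless d divides b
    # Reduce to (a/d) x ≡ (b/d) (mod n/d); now a/d is invertible mod n/d.
    a_red, b_red, n_red = a // d, b // d, n // d
    x0 = (pow(a_red, -1, n_red) * b_red) % n_red
    return [x0 + k * n_red for k in range(d)]


print(solve_linear_congruence(6, 4, 10))   # [4, 9]
print(solve_linear_congruence(6, 3, 10))   # []
```

In the first example d = gcd(6, 10) = 2 divides 4, and there are exactly d solutions mod 10; in the second, 2 does not divide 3 and there are none.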


Exercise 1.6.10 Suppose n ∈ Z^+ and n is odd. Show that
1 + 2 + 3 + ··· + (n − 1) ≡ 0 (mod n).


Is this congruence still true for even n?


Exercise 1.6.11 Solve 5x ≡ 1 (mod 163).


One of the morals of this section is that you can often replace real numbers with elements
of Z_n. In particular, you can do linear algebra with R replaced by Z_n. It is even better when
n is a prime.


Exercise 1.6.12 Solve the following pair of linear congruences simultaneously for 
x,y (mod 5) 


2x + 2y ≡ 1 (mod 5),
3x + 2y ≡ 2 (mod 5).

Then rewrite the congruences as a matrix equation of the form Mv = b, where M is a 2 × 2
matrix with entries in Z_5, while b and v are column vectors with entries in Z_5. We multiply
matrix times vector as in formula (1.3).


Exercise 1.6.13 Find the inverse mod 7 of the matrices € and (; i where the 


3.4 
inverse of a matrix is defined as usual in linear algebra. See Example 3 in Section 2.3. 


Exercise 1.6.14 Show that if f_n denotes the nth Fibonacci number from Section 1.4, then
gcd(f_n, f_{n+1}) = 1.


Gallian [33, Chapter 1] gives many applications of modular arithmetic in everyday life: 
for example, in the assignment of check digits in Universal Product Codes read by optical 
scanners in large stores. 


Exercise 1.6.15 Solve the simultaneous congruences 


2x ≡ 1 (mod 5),
3x ≡ 2 (mod 7)
with x ∈ Z_35.


The preceding sort of problem will be considered more generally in Section 3.6 under the 
heading of the Chinese remainder theorem - a result with many applications. 


1.7 Relations 


Many of the ideas that we have already discussed are examples of relations on the set Z:
for example, a ≤ b or a ≡ b (mod m). The modern way to think of such things is as a subset
of the Cartesian product Z × Z.


Definition 1.7.1 A (binary) relation R on a set A is a subset of the set 


A × A = {(a, a′) | a, a′ ∈ A}.


We may write aRa′ if (a, a′) ∈ R. It is also possible to have a relation R from set A to set
B, which is a subset of A × B.


Examples: Relations 


1. A relation on Z is ≤. That is, the relation is the set of pairs (a, b) in Z × Z with a ≤ b. Figure
1.10 shows the relation using Mathematica to plot a 50 × 50 grid whose (i, j) square
is turquoise if i ≤ j and purple otherwise. We created the figure with the Mathematica
command:


ListDensityPlot[Table[i - j, {j, 1, 50}, {i, 1, 50}],
  ColorFunction -> (If[# > 0, Purple, White] &),
  ColorFunctionScaling -> False, InterpolationOrder -> 0]


2. Another relation on Z is divisibility: that is, the pairs (a, b) in Z × Z with a|b (i.e., a
divides b). Figure 1.11 shows the relation using Mathematica to plot a 50 × 50 grid
whose (i, j) square is turquoise if j|i and purple otherwise.

3. Congruence (mod m) gives yet another relation on Z: that is, a ≡ b (mod m). Figure
1.12 shows the relation using Mathematica to plot a 50 × 50 grid whose (i, j) square is
turquoise if i ≡ j (mod 5) and purple otherwise. ▲


Figure 1.10 Mathematica picture of the x ≤ y relation for the integers between 1 and 50


Figure 1.11 Mathematica picture of the y|x relation for the integers between 1 and 50


Exercise 1.7.1 Explain the lines in Figures 1.10-1.12. 


Definition 1.7.2 A relation R on a set S is an equivalence relation iff it has the following
three properties. We will write a ~ b instead of (a, b) ∈ R.


1. a ~ a for all a ∈ S (reflexivity).
2. a ~ b ⟺ b ~ a (symmetry).
3. a ~ b and b ~ c ⟹ a ~ c (transitivity).


Figure 1.12 Mathematica picture of the x ≡ y (mod 5) relation for the integers between 1 and 50


Next we want to decide whether the three relations in the preceding examples are equiv- 
alence relations on Z. The first two relations are considered in the exercises below. What 
about the third relation - congruence? 


Example. The congruence relation a ≡ b (mod m) is an equivalence relation on a, b ∈ Z.
To see this, we check the three properties of an equivalence relation.


1. Reflexivity. Certainly a ≡ a (mod m) since m divides 0 = a − a.

2. Symmetry. If a ≡ b (mod m) then a − b = km for some integer k, and then b − a = −km,
which implies that b ≡ a (mod m).

3. Transitivity. Assume a ≡ b (mod m) and b ≡ c (mod m). Then a − b = km and b − c =
k′m for some integers k, k′. It follows that


a − c = a − b + b − c = km + k′m = (k + k′)m.

Therefore a ≡ c (mod m).

Thus a ≡ b (mod m) is indeed an equivalence relation on Z. We will see many generalizations
of this equivalence relation when we envision quotient groups and quotient rings
in Sections 3.4 and 5.4. ▲


Exercise 1.7.2 Show that a ≤ b is not an equivalence relation on Z. Can you see this from
Figure 1.10? 


Exercise 1.7.3 Show that a|b is not an equivalence relation on Z. Can you see this from 
Figure 1.11? 


Definition 1.7.3 Suppose a ~ b denotes an equivalence relation on a set S. Define the
equivalence class of a ∈ S to be


[a] = {b ∈ S | b ~ a}.


Definition 1.7.4 A partition of a set S is a collection of non-empty pairwise disjoint
subsets S_j ⊂ S whose union is S. More precisely, we mean that


S = ⋃_j S_j, with S_i ∩ S_j = ∅, ∀ i and j such that i ≠ j, and S_j ≠ ∅, ∀ j.


Equivalence relations are closely connected with partitions of sets as we shall see in the 
next theorem. 


Examples. Consider congruence modulo 3 as an equivalence relation on Z. There are only 
three equivalence classes 

[0] = {all integers which are divisible by 3} = {3n | n ∈ Z}
    = {a ∈ Z | a ≡ 0 (mod 3)};

[1] = {all integers with remainder 1 when divided by 3}
    = {1 + 3n | n ∈ Z} = {a ∈ Z | a ≡ 1 (mod 3)};

[2] = {all integers with remainder 2 when divided by 3}
    = {2 + 3n | n ∈ Z} = {a ∈ Z | a ≡ 2 (mod 3)}.

Note that we have a partition of Z into the three equivalence classes mod 3:


Z = [0] ∪ [1] ∪ [2], [0] ∩ [1] = ∅, [0] ∩ [2] = ∅, [1] ∩ [2] = ∅. ▲
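

A computer cannot list all of Z, but it can show the same partition on a finite window of integers. In the Python sketch below, the name classes_mod and the window −6, ..., 6 are our own choices.

```python
def classes_mod(m, elements):
    """Group `elements` into congruence classes mod m, keyed by remainder."""
    classes = {r: [] for r in range(m)}
    for a in elements:
        classes[a % m].append(a)     # in Python, a % m lies in {0, ..., m-1}
    return classes


classes = classes_mod(3, range(-6, 7))
for r, members in classes.items():
    print(f"[{r}] contains {members}")
# [0] contains [-6, -3, 0, 3, 6]
# [1] contains [-5, -2, 1, 4]
# [2] contains [-4, -1, 2, 5]
```

Every integer in the window lands in exactly one of the three lists, which is the partition property restricted to the window.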


Theorem 1.7.1 Equivalence classes from an equivalence relation ~ on a set S give a 
partition of S. Conversely, given a partition of a set S as a union of pairwise disjoint 
non-empty subsets S_j:


S = ⋃_j S_j, with S_i ∩ S_j = ∅, when i ≠ j,


we can define an equivalence relation on x, y ∈ S by saying x ~ y iff both x and y lie in
the same subset S_j, for some j.


Proof. ⟹ As usual, we write [a] = {x ∈ S | x ~ a} for the equivalence class of a ∈ S.

Why are the equivalence classes non-empty? By reflexivity we know a ∈ [a].

Why is S a union of the equivalence classes? Every a ∈ S is in the equivalence class [a]
by reflexivity.

Why are the classes pairwise disjoint? If c ∈ [a] ∩ [b], then c ~ a and c ~ b. By symmetry,
then a ~ c and c ~ b. So by transitivity, a ~ b. This implies if x ∈ [a] then x ~ a and a ~ b, so
x ~ b and thus x ∈ [b]. Thus [a] ⊂ [b]. Similarly [b] ⊂ [a]. Therefore [a] = [b] if [a] ∩ [b] ≠ ∅.

⟸ We leave the proof of the converse as an exercise. ▲


Exercise 1.7.4 Prove the converse part of the preceding theorem. 


Exercise 1.7.5 Define a relation on a, b ∈ R by a ~ b ⟺ a − b ∈ Z. Show that this is an
equivalence relation on R. Find a nice set of representatives for the equivalence classes.


Now we consider another sort of relation which will appear repeatedly in our discussions. 


Definition 1.7.5 A partial order denoted ≤ on a set S is a relation that has the following
three properties, for elements a, b, c ∈ S:


1. reflexive: a ≤ a;
2. antisymmetric: a ≤ b and b ≤ a implies a = b;
3. transitive: a ≤ b and b ≤ c implies a ≤ c.


Examples 


1. Consider the set of subsets of a given set. Then set-theoretic inclusion is a partial order.
2. If the set S = Z or R, then a ≤ b is a partial order.
3. If the set S = Z^+, divisibility a|b is a partial order. ▲


Given a finite partially ordered set S, or poset, one can make a diagram called a poset or
Hasse diagram. The diagram consists of a set of vertices such that each vertex corresponds
to an element of S. Then we draw a rising line from vertex a to vertex b if a < b and there
is no c ∈ S with a < c < b.


Example. Consider the poset diagram for the divisibility on the set of positive divisors of 
24. This is shown in Figure 1.13. ▲


Figure 1.13 Poset diagram of the positive divisors of 24
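

The rule for which lines to draw in a poset diagram can be turned into a short Python sketch. It lists the pairs (a, b) joined by a rising line for the positive divisors of 24 under divisibility; the names divisors and hasse_edges are our own.

```python
def divisors(n):
    """Positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def hasse_edges(elements, leq):
    """Pairs (a, b) with a < b in the order `leq` and no c strictly between."""
    edges = []
    for a in elements:
        for b in elements:
            if a != b and leq(a, b):
                between = any(
                    c != a and c != b and leq(a, c) and leq(c, b)
                    for c in elements
                )
                if not between:
                    edges.append((a, b))
    return edges


edges = hasse_edges(divisors(24), lambda a, b: b % a == 0)
print(edges)
# [(1, 2), (1, 3), (2, 4), (2, 6), (3, 6), (4, 8), (4, 12), (6, 12), (8, 24), (12, 24)]
```

Each pair corresponds to multiplying by a single prime, 2 or 3, which is why the diagram for 24 = 2^3 · 3 looks like a 4 by 2 grid.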


Exercise 1.7.6 Draw the poset diagram for the set of all subsets of {1, 2, 3, 4} under the
relation ⊂.


Exercise 1.7.7 Draw the poset diagram for the set of positive divisors of 30. 


Exercise 1.7.8 Draw the poset diagram for the set of positive integers ≤ 20 under the
relation ≤.


1.8 Functions, the Pigeonhole Principle, and Binary Operations 


We now have a new way to think about functions. Identifying a function with its graph, 
we see that a function is a special kind of relation. 


Definition 1.8.1 Suppose A and B are sets. A function (or mapping) f : A → B is a
relation from A to B such that every a ∈ A is the first entry of precisely one (a, b) ∈ f;
that is,


(i) if a ∈ A, ∃ b ∈ B such that (a, b) ∈ f;
(ii) if (a, b) ∈ f and (a, c) ∈ f, then b = c.


We write b = f(a) instead of (a, b) ∈ f.


Condition (ii) means that the function f is well defined: that is, f(a) is a unique element
of the set B. No ambiguity is allowed. It is not possible to make a computation with
ambiguously defined functions or group elements or sets or numbers. We will often call a 
function a “map,” which is short for mapping. 


Warning. Some algebra books (e.g., Herstein [42]) write af instead of f(a). This is often
called Reverse Polish Notation. That old Hewlett-Packard calculator that I bought in the
early 1970s used it. We will always write f(a) and not af.


Definition 1.8.2 If f : A → B is a function, then the set f(A) = {f(a) | a ∈ A} is called
the image of A under f.


If f : A → B is a function, the image set f(A) is a subset of B. Later we will define a
subset of A called the inverse image.


Exercise 1.8.1 State whether each of the following equations is true or false and explain. 
(a) f(A ∪ B) = f(A) ∪ f(B),

(b) f(A ∩ B) = f(A) ∩ f(B).


Examples


(1) Favorite functions from Z to Z:
f(x) = x^2, for all x ∈ Z.
g(x) = x + 1, for all x ∈ Z.


(2) A non-function:
h(x) = either 1 or −1, for all x ∈ Z. ▲


Definition 1.8.3 (Composition of Functions). If f : A → B and g : B → C, define the
composition of f and g to be (g ∘ f)(x) = g(f(x)), for all x ∈ A. Then g ∘ f is a function and
g ∘ f : A → C.


Note that g ∘ f is not usually the same as f ∘ g. For example, assuming A = B = C = R, if
f(x) = x^2 and g(x) = x + 1, then


(g ∘ f)(x) = x^2 + 1,

while


(f ∘ g)(x) = (x + 1)^2 = x^2 + 2x + 1.
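

The same comparison can be made by machine. In this Python sketch the helper compose is our own name; it returns the function x ↦ g(f(x)).

```python
def compose(g, f):
    """Return the composition g ∘ f, i.e. the function x -> g(f(x))."""
    return lambda x: g(f(x))


def f(x):
    return x ** 2


def g(x):
    return x + 1


g_after_f = compose(g, f)    # x -> x^2 + 1
f_after_g = compose(f, g)    # x -> (x + 1)^2

print(g_after_f(3))          # 10
print(f_after_g(3))          # 16
```

The two compositions already disagree at x = 3, so g ∘ f and f ∘ g are different functions.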


Exercise 1.8.2 Show that composition of functions is associative: that is, if f : A → B, g :
B → C, and h : C → D, then


h ∘ (g ∘ f) = (h ∘ g) ∘ f.


Definition 1.8.4 A function f : A → B is one-to-one (1-1 or injective) iff f(u) = f(v)
implies u = v.


Examples 


1. f : Z → Z defined by f(x) = 5x is 1-1 since Z has no zero divisors. Thus 5u = 5v implies
5(u − v) = 0 and therefore u − v = 0. This means that u = v. If we replace Z by Z_10, then
this function maps 2 (mod 10) to 0 (mod 10) and is thus not 1-1, since f(2) = f(0).

2. f : Z → Z defined by f(x) = x^2 is not 1-1 since f(x) = f(−x). If we replace Z by Z_2, then
this function maps x to x^2 ≡ x (mod 2). In short, it is what we call the identity function
on Z_2, since f(x) = x.

3. This example is one of the most important in the following chapters. You should remember
such functions from linear algebra. Consider the set R^{m×n} consisting of m × n
matrices A over R. A matrix A = (a_{ij})_{1≤i≤m, 1≤j≤n} ∈ R^{m×n} gives rise to a linear function
R^n → R^m defined by y = T_A(x) = Ax, where we are thinking of the elements of R^n as
column vectors and we define matrix multiplication as usual by writing


A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix},
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix},
y = Ax = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix},    (1.3)


y_j = \sum_{i=1}^{n} a_{ji} x_i, for j = 1, ..., m.


You can similarly define the multiplication of matrix A ∈ R^{m×n} by matrix B ∈ R^{n×r}, by
writing B = (b_1 ··· b_r) where b_j denotes the jth column of B, and then AB = (Ab_1 ··· Ab_r),
multiplying each column vector of B by A to get the corresponding column of AB. These
formulas will work with R replaced by any ring such as Z or Z_n. This definition of matrix
multiplication is due to Arthur Cayley. ▲
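

Formula (1.3) and the column-by-column description of AB translate directly into code, with every sum reduced mod n. The Python sketch below uses plain lists of rows; the names mat_vec_mod and mat_mul_mod and the sample matrix are our own illustrations.

```python
def mat_vec_mod(M, v, n):
    """y = Mv over the integers mod n: y_j = sum_i M[j][i] * v[i] (mod n)."""
    return [sum(M[j][i] * v[i] for i in range(len(v))) % n
            for j in range(len(M))]


def mat_mul_mod(A, B, n):
    """AB over the integers mod n, built one column of B at a time."""
    cols = [mat_vec_mod(A, [row[j] for row in B], n)
            for j in range(len(B[0]))]
    # cols[j] is the jth column of AB; transpose back to a list of rows.
    return [list(row) for row in zip(*cols)]


A = [[1, 2],
     [3, 4]]
print(mat_vec_mod(A, [5, 6], 7))   # [3, 4]
print(mat_mul_mod(A, A, 7))        # [[0, 3], [1, 1]]
```

Over Z the product A·A would be [[7, 10], [15, 22]]; reducing each entry mod 7 gives the matrix printed above.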


Definition 1.8.5 f : A → B is onto (or surjective) iff for every b ∈ B there exists a ∈ A such
that f(a) = b or, equivalently, f(A) = {f(x) | x ∈ A} = B.


Examples 


1. f : Z → Z defined by f(x) = x + 5 is onto. For given y, you can solve y = x + 5 for x.
Answer: x = y − 5.

2. f : Z → Z defined by f(x) = 5x is not onto, since not every integer is divisible by 5.
Of course, if you replace Z with R, with the same definition of the function, namely
f(x) = 5x, then f : R → R is onto. ▲


A function that is 1-1 and onto is also called bijective or a bijection. We will not usually 
use these words - preferring to say 1-1 and onto. 


Example. The function g : Z → Z defined by g(x) = x + 1, for all x ∈ Z, is both 1-1 and
onto. ▲


There is a right identity for the operation of composition of functions. If f : A → B and
I_A(x) = x, ∀ x ∈ A, then f ∘ I_A = f. Similarly I_B is a left identity for f: that is, I_B ∘ f = f.


Definition 1.8.6 If f : A → B is 1-1 and onto, it has an inverse function f^{-1} : B → A
defined by requiring f ∘ f^{-1} = I_B and f^{-1} ∘ f = I_A. If f(a) = b, then f^{-1}(b) = a.


When the sets A and B are subsets of R and the function f : A → B is 1-1 and onto, then
one gets the graph of the inverse function from that of the function by interchanging the 
x- and y-axes. 


Exercise 1.8.3 Suppose that f : A → B is 1-1 and onto and g : B → C is 1-1 and onto. Show
that g ∘ f : A → C is 1-1 and onto.


Exercise 1.8.4


(a) Prove that if f : A → B is 1-1 and onto, it has an inverse function f^{-1}.
(b) Conversely show that if f has an inverse function, then f must be 1-1 and onto.
(c) Show that if f : A → B is 1-1 and onto, then f^{-1} : B → A is also 1-1 and onto.


Exercise 1.8.5 State which of the following functions are 1-1; then state which are onto.
(a) f : Z → Z given by f(x) = x^3;

(b) f : Z → Z given by f(x) = 3x;

(c) f : Z → Z given by f(x) = x − 3.


Maybe we should have introduced the following definition earlier, but we are now more 
capable of dealing with it precisely. 


Definition 1.8.7 We say that the empty set has 0 elements. If n ∈ Z^+, we say that a set
S has n elements iff there is a 1-1, onto function


f : {1, 2, ..., n} → S.


Then we write n = |S|. If a set cannot be said to have n elements for any n ∈ Z^+ ∪ {0},
we call it infinite.


To a mathematician, the infinite is nothing more than “not finite” - no need to philos- 
ophize. However, it is also possible to differentiate between different orders of magnitude 
of infinite sets. A countable or denumerable set S is in 1-1 correspondence with Z^+.
That is, there is a 1-1, onto function f : Z^+ → S. Cantor basically invented this subject
and was not really popular for doing that. Many amazing things come to light -
Q is countable, but R is not, for example. We do not really need to think about this 
much - mercifully. See Birkhoff and MacLane [9, Chapter XII] or Vilenkin [122]. The 
last reference explains the properties of countable sets with the story of the hotel with a 
countable number of rooms and its amazing abilities to accommodate guests. Sadly, how- 
ever, this countable hotel is not able to produce a countable list telling which rooms are 
occupied. 

We can use the function f : {1, 2, ..., n} → S in the preceding definition to give a labeling
of the elements of the finite set S with n elements:


S = {f(1) = s_1, f(2) = s_2, ..., f(n) = s_n}.


Similarly for a countably infinite set S, one can use the 1-1, onto function f : Z^+ → S to
label the elements of S:


S = {f(1) = s_1, f(2) = s_2, ..., f(n) = s_n, ...}.


That is, one can view a countably infinite set as a sequence S = {s_n}_{n≥1}.

The following exercises give the basic principles of counting finite sets. The general 
theory of counting things is called combinatorics. A discussion of the basic principles can 
be found in the book by Clifford Stein, Robert L. Drysdale, and Kenneth P. Bogart [113] as 
well as those of Kenneth Rosen [92, 93]. 


Exercise 1.8.6 Show that for finite sets S, T, if S ∩ T = ∅, then |S ∪ T| = |S| + |T|. Then
extend the result to a union of a finite number of pairwise disjoint finite sets using
mathematical induction.


Exercise 1.8.7 Show that for finite sets S, T, |S × T| = |S| |T|.


Hint. You can derive this from the preceding exercise by writing S × T as a finite disjoint
union of sets having the same order as S.


Exercise 1.8.8 Show that for finite sets S, T, |{f : S → T}| = |T|^{|S|}.


Hint. This can be derived from the preceding exercise, once that exercise is extended to a 
product of n finite sets using mathematical induction. 
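

For small sets these counts can be checked by brute force. The Python sketch below lists every function from S to T, encoding each one as a dictionary; the name all_functions and the sample sets are our own choices.

```python
from itertools import product


def all_functions(S, T):
    """Every function S -> T, each encoded as a dict {s: f(s)}."""
    S = list(S)
    # Choosing f means choosing one image in T for each element of S.
    return [dict(zip(S, images)) for images in product(T, repeat=len(S))]


S = ["a", "b", "c"]
T = [0, 1]

pairs = list(product(S, T))          # the Cartesian product S x T
functions = all_functions(S, T)

print(len(pairs), len(S) * len(T))           # 6 6
print(len(functions), len(T) ** len(S))      # 8 8
```

A function is determined by a list of |S| independent choices from T, which is the idea behind the hint to Exercise 1.8.8.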


The following definition may appear silly but we will need it in later sections. Of course,
you might be thinking that f and g in the definition are the same function, but that would
be wrong thinking. The sets S and T in the notation f : S → T actually matter greatly to the
notion of the function f and its properties.


Definition 1.8.8 Suppose f : S → T and R ⊂ S. Define the restriction of f to R, written
f|_R = g, to be the function g : R → T defined by setting g(x) = f(x), for all x ∈ R.


Example. Consider the function f : Z → Z defined by f(x) = x^2. We know that f is neither
1-1 nor onto. However f|_{Z^+} is 1-1, but still not onto. In fact f|_{Z^+} does not even map Z^+
onto Z^+. ▲


Exercise 1.8.9 (The Pigeonhole Principle). If A and B are finite sets, each with the same
number n of elements, then f : A → B is 1-1 iff f is onto.


Hint. Think of the elements of set A as pigeons and the elements of the set B as holes.
Think of f(a) = b as putting pigeon a into hole b. So f is 1-1 means no 2 pigeons share
a hole. And f is onto means every hole has a pigeon. See Figure 1.14.


Figure 1.14 The pigeonhole principle


In 1834 Dirichlet formulated the pigeonhole principle. He called it the “Schubfachprinzip” 
which translates to “drawer principle.” 


Definition 1.8.9 For any function $f: A \to B$, we define the inverse image of a set $S \subset B$
to be the subset of $A$ given by


$$f^{-1}(S) = \{a \in A \mid f(a) \in S\}.$$


In the preceding definition we do not assume that the function $f$ has an inverse function.
For example, consider $f: \mathbb{Z} \to \mathbb{Z}$ defined by $f(x) = x^2$. Then


$$f^{-1}(\mathbb{Z}) = \mathbb{Z}, \quad f^{-1}(\mathbb{Z}^+) = \mathbb{Z}^+ \cup (-\mathbb{Z}^+), \quad f^{-1}(\{0\}) = \{0\}.$$
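
The inverse image can be computed directly from its definition whenever the domain is finite. The sketch below restricts the squaring map to a finite window of integers, since a program cannot enumerate all of $\mathbb{Z}$; the window, the name inverse_image, and the truncated copy of the positive integers are assumptions made for illustration.

```python
def inverse_image(f, domain, target):
    """Return {a in domain : f(a) in target}; f need not be invertible."""
    return {a for a in domain if f(a) in target}

def square(x):
    return x * x

window = range(-5, 6)            # a finite piece of Z, chosen for illustration
positives = set(range(1, 26))    # the part of Z^+ that squares of the window can reach

print(sorted(inverse_image(square, window, positives)))  # every nonzero integer in the window
print(inverse_image(square, window, {0}))                # {0}
print(inverse_image(square, window, {2}))                # set(): nothing squares to 2
```

The last line shows that an inverse image may be empty even when the target set is not.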


Exercise 1.8.10 Show that the inverse image has the following properties for $f: S \to T$, and
$A, B \subset T$.


(a) $f^{-1}(A \cup B) = f^{-1}(A) \cup f^{-1}(B)$;
(b) $f^{-1}(A \cap B) = f^{-1}(A) \cap f^{-1}(B)$.

Now let us consider a few more counting problems. Recall that $n! = 1 \cdot 2 \cdots (n-1) \cdot n$
and $0! = 1$.


Definition 1.8.10 The symbol $\binom{n}{k}$, read “n choose k” (the number of combinations of $n$
things taken $k$ at a time, not counting order) is defined to be


$$\binom{n}{k} = \frac{n!}{(n-k)!\,k!} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1}, \tag{1.4}$$


for $n, k$ non-negative integers.

The symbol $\binom{n}{k}$ represents the number of $k$-element subsets of a set with $n$ elements.
To see this, let us count the ways to create a k-element subset of an n-element set. First 
there are n ways to choose the first element of the subset. Then there are n— 1 ways to 
choose the second element, since it must not equal the first element. Continue in this way 
until you reach the kth element. There will be n — (k — 1) ways to choose this element. The 
product of all these numbers is the numerator on the right in equation (1.4). This numerator 
is really the number of ordered k-tuples of elements from an n-element set. But two sets are 
the same if their elements are permuted or rearranged: for example, {1, 2,3} = {2,3, 1}= 
{3, 1,2} = {1, 3, 2} = {3, 2,1} = {2, 1,3}. Thus there are k! of the k-tuples corresponding 
to one k-element set. It follows that we must divide by k!. 
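
The counting argument just given can be watched in action for small numbers. In this sketch the values n = 5 and k = 3 are arbitrary choices of ours; the standard library does the enumeration.

```python
from itertools import combinations, permutations
from math import comb, factorial

n, k = 5, 3
items = range(1, n + 1)

ordered = list(permutations(items, k))    # ordered k-tuples of distinct elements
subsets = list(combinations(items, k))    # k-element subsets

# Each subset arises from exactly k! ordered tuples.
print(len(ordered), len(subsets), factorial(k))                    # 60 10 6
print(len(ordered) // factorial(k) == len(subsets) == comb(n, k))  # True
```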

The symbol $\binom{n}{k}$ is also a binomial coefficient. The binomial theorem says


$$(x+y)^n = x^n + n x^{n-1} y + \frac{n(n-1)}{2} x^{n-2} y^2 + \cdots + n x y^{n-1} + y^n.$$


Exercise 1.8.11 Prove that


(a) $\binom{n-1}{k-1} + \binom{n-1}{k} = \binom{n}{k}$;  (b) $\binom{n}{k} = \binom{n}{n-k}$.


Hint. For part (a), just use the definition and put everything on the left over a common
denominator.


Note that the equations in the preceding exercise are quite visible in Pascal’s triangle: 


1 1 
1 2 1 
1 3 3 1 
1 4 6 4 1 
1 5 10 10 5 1 


The $n$th row of Pascal's triangle gives the coefficients in the expansion of $(x+y)^n$. According
to the equation in part (a) of the exercise, the $k$th coefficient in row $n$ is the sum of
the two coefficients nearest it in the row above (row $n-1$). Here we view the coefficients
outside the triangle as 0.
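
The addition rule is all one needs to generate the triangle. Here is a minimal Python sketch; the function name pascal_rows is our own, and the rows are numbered from 0 so that row $n$ holds the coefficients of $(x+y)^n$.

```python
from math import comb

def pascal_rows(n_max):
    """Rows 0..n_max of Pascal's triangle, built only from the addition rule."""
    rows = [[1]]
    for _ in range(n_max):
        prev = [0] + rows[-1] + [0]          # coefficients outside the triangle are 0
        rows.append([prev[i] + prev[i + 1] for i in range(len(prev) - 1)])
    return rows

for row in pascal_rows(5):
    print(row)

# Compare with the factorial formula (1.4).
print(all(row[k] == comb(n, k) for n, row in enumerate(pascal_rows(10)) for k in range(n + 1)))
```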

We name this triangle for B. Pascal (1623-1662), but it was known earlier to Indian, 
Persian, Chinese and Italian mathematicians. 


Exercise 1.8.12 Use mathematical induction to prove the binomial theorem:


$$(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$


There is also a short - and thus preferable - combinatorial proof of the binomial theorem.
Look at the coefficient of $x^k y^{n-k}$ in the expansion of


$$\underbrace{(x+y)(x+y)\cdots(x+y)}_{n \text{ factors}}.$$


This coefficient comes from choosing $k$ of the $x$s and $(n-k)$ of the $y$s. This is the same as
the number of $k$-element subsets of an $n$-element set, namely $\binom{n}{k}$.


Exercise 1.8.13 Assume the standard facts from calculus for differentiable functions $f$ and
$g$ that $(f+g)' = f' + g'$ and $(fg)' = f'g + fg'$. Prove the formula for the $n$th derivative of the
product of two functions, assuming each function has an nth derivative. Of course binomial 
coefficients appear in this formula. Thus to prove the result by induction is somewhat similar 
to proving the binomial theorem by induction. 


Exercise 1.8.14 Suppose that $p$ is a prime and $1 \le k < p$. Show that $p$ divides the binomial
coefficient $\binom{p}{k}$.


Exercise 1.8.15 Prove the arithmetic-geometric mean inequality, which states that, assuming
$x_i \in \mathbb{R}^+$ for all $i = 1, \ldots, n$:


$$\frac{1}{n}(x_1 + x_2 + \cdots + x_n) \ge \sqrt[n]{x_1 x_2 \cdots x_n}.$$


The left side is the arithmetic mean and the right side is the geometric mean of the
numbers $x_i$.


Hint. There are many proofs of this famous inequality. We are sort of hoping for a proof 
by mathematical induction here. That may be less revealing than the proof using Jensen’s 
inequality (applied to the function —log(x)) from the next exercise. You will find a huge 
number of proofs on the web. 


In order to do the next exercise, we must recall the definition of convex function from 
calculus and we should perhaps note that not all calculus books use the terminology of the 
exercise (for example, see Purcell [87] and Lang [63]). 


Exercise 1.8.16 Prove Jensen’s inequality concerning convex functions $f: I \to \mathbb{R}$, where $I$ is
an interval on the real line. We call $f$ convex if $f(a_1 x_1 + a_2 x_2) \le a_1 f(x_1) + a_2 f(x_2)$ for all
$x_1, x_2 \in I$ and for all $a_1, a_2 \in (0,1)$ such that $a_1 + a_2 = 1$. This means that the part of the
graph of $y = f(x)$ for $x$ between any two points $x_1$ and $x_2$ in $I$ lies on or below the line connecting
the points $(x_i, f(x_i))$, $i = 1, 2$. Jensen’s inequality says that if the weights $a_i \in (0,1)$ are such
that $a_1 + \cdots + a_n = 1$ and if $x_1, \ldots, x_n \in I$, then


$$f(a_1 x_1 + \cdots + a_n x_n) \le a_1 f(x_1) + \cdots + a_n f(x_n).$$


Setting all weights equal to $1/n$ and $f(x) = -\log x$ gives the preceding exercise. This inequality
seems a little easier to prove by mathematical induction than the arithmetic-geometric
mean inequality.


Exercise 1.8.17 (Principle of Inclusion-Exclusion). Suppose that $S_1, \ldots, S_n$ are subsets of a
finite set $S$. Show that the number of elements of $S_1 \cup S_2 \cup \cdots \cup S_n$ is


$$\sum_{i=1}^{n} |S_i| - \sum_{1 \le i < j \le n} |S_i \cap S_j| + \sum_{1 \le i < j < k \le n} |S_i \cap S_j \cap S_k| - \cdots + (-1)^{n-1} |S_1 \cap \cdots \cap S_n|.$$


Hint. Try the cases $n = 1, 2$, and $3$ first. Then the easiest way is to proceed as follows.
Consider a point $p$ that is contained in exactly $k$ of the sets $S_i$. Count the number of times
that $p$ is counted in each term of the formula. Then make use of the binomial theorem to
see that these numbers add up to 1.
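
Before attempting the proof, it may help to see the alternating sum evaluated on a concrete example. The three sets below and the function name inclusion_exclusion are our own illustrative choices; the code checks the formula on one example and proves nothing.

```python
from itertools import combinations

def inclusion_exclusion(sets):
    """Size of the union computed by the alternating sum over intersections."""
    total = 0
    for r in range(1, len(sets) + 1):
        sign = (-1) ** (r - 1)
        for group in combinations(sets, r):
            total += sign * len(set.intersection(*group))
    return total

S1, S2, S3 = {1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}
print(inclusion_exclusion([S1, S2, S3]), len(S1 | S2 | S3))   # 7 7
```

Here the single sets contribute 11, the pairwise intersections subtract 5, and the triple intersection adds back 1.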


Definition 1.8.11 A binary operation on a set $S$ is a function $\circ: S \times S \to S$. We will often
write $\circ(a, b) = a \circ b$, for $a, b \in S$. Of course $\circ$ may be replaced by all sorts of symbols: for
example, $+$, $\times$, $*$.


We have already seen many examples of binary operations: for example, addition or
multiplication on $\mathbb{Z}$ or $\mathbb{Z}_n$. The main property required of a binary operation is that it is
well defined. In the next section we begin our discussions of groups - entities with one
binary operation.


Exercise 1.8.18 Which of the following are binary operations on the set $\mathbb{Z}$? Why?


(a) $a \circ b = \frac{a}{b}$;
(b) $a \circ b = ab^2$;
(c) $a \circ b = \sqrt{ab}$.


Exercise 1.8.19 Consider the multiplication of matrices $A, B \in \mathbb{R}^{n \times n}$, defined by writing
$B = (b_1 \cdots b_n)$, with columns $b_j \in \mathbb{R}^n$, and then $AB = (Ab_1 \cdots Ab_n)$, using the multiplication
from formula (1.3). Show that this operation satisfies the associative law $A(BC) = (AB)C$,
for all $A, B, C \in \mathbb{R}^{n \times n}$.


2.1 What is a Group?


Cayley gave the first abstract definition of a group in 1854. We have already seen some 
examples of groups: 


• the integers $\mathbb{Z}$ under addition;
• the integers mod $n$, $\mathbb{Z}_n$, under the operation $a + b \pmod{n}$.


The usefulness of the abstract concept is that it reduces the number of proofs we must
produce. No longer must we prove the same thing over and over again for each special
example. We pay for this by needing to imagine the abstract concept: a group with an 
arbitrary number of elements - maybe 20 billion, maybe 2, maybe infinitely many. 


Definition 2.1.1 A group $G$ is a non-empty set of elements with one (binary) operation,
which is a function from ordered pairs in $G \times G$ to $G$, taking $(a, b) \in G \times G$ to a unique
element $a \cdot b \in G$, such that we have three laws:


1. Associative law:
$a \cdot (b \cdot c) = (a \cdot b) \cdot c$, for all $a, b, c \in G$.
2. Identity (call it $e$, or sometimes $I$ or $0$ or $1$): there exists an element $e \in G$ such that
$a \cdot e = e \cdot a = a$, for all $a \in G$.
3. Inverses: given $a \in G$, there is an element $a^{-1} \in G$ such that
$a \cdot a^{-1} = a^{-1} \cdot a = e$.


The group $G$ is Abelian or commutative if for all $a, b \in G$, we have $a \cdot b = b \cdot a$. The fact
that $a, b \in G \implies a \cdot b \in G$ is called closure. We will usually write $ab$ (or perhaps $a \times b$ or
$a + b$ or $a \circ b$) instead of $a \cdot b$.
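
For a finite set with an explicitly given operation, the laws in Definition 2.1.1 can be tested by exhaustive search. The following Python sketch does exactly that; the function name is_group and the two sample operations on the integers mod 6 are our own choices, and the method is hopeless for large or infinite groups.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of closure and the three group laws on a finite set."""
    G = list(elements)
    if not G:
        return False
    # Closure: op must land back in G.
    if any(op(a, b) not in G for a, b in product(G, G)):
        return False
    # Associative law.
    if any(op(a, op(b, c)) != op(op(a, b), c) for a, b, c in product(G, G, G)):
        return False
    # Identity.
    identities = [e for e in G if all(op(a, e) == a == op(e, a) for a in G)]
    if not identities:
        return False
    e = identities[0]
    # Inverses.
    return all(any(op(a, b) == e == op(b, a) for b in G) for a in G)

n = 6
print(is_group(range(n), lambda a, b: (a + b) % n))   # True: Z_6 under addition
print(is_group(range(n), lambda a, b: (a * b) % n))   # False: 0 has no inverse
```

The associativity test alone examines $|G|^3$ triples, which is why abstract arguments are preferred whenever they are available.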


Example 1. $\mathbb{Z}$ under addition forms a commutative group. The identity is 0. The inverse
of $x \in \mathbb{Z}$ is written $-x$. But $\mathbb{Z}$ under multiplication does not form a group, nor does
$\mathbb{Z} - \{0\}$. ▲

Example 2. The dihedral group $D_3$ of symmetries of an equilateral triangle has six elements
$D_3 = \{I, R, R^2, F, FR, FR^2\}$.


The actions of these elements are pictured in Figure 2.1. We describe the elements in more
detail in the following paragraphs - where we note that computing the product of elements
of $D_3$ is made easier when one uses permutation notation. This group $D_3$ is the same as
the group $S_3$ of all permutations of the three objects: Blue, Pink, Yellow. Write numbers
instead. Write 1 for the blue position, 2 for pink, and 3 for yellow. A permutation $p$ in $S_3$
means a function $p: \{1, 2, 3\} \to \{1, 2, 3\}$ which is 1-1 and onto.


Figure 2.1 The symmetries of a regular triangle are pictured. Panels: $I$ = identity; $R$ = rotate 120° counterclockwise;
$F$ = flip about vertical axis; $R^2$ = rotate 240° counterclockwise; $FR$ = rotate 120°, then flip; $FR^2$ = rotate 240°, then flip


To compute in D3, it helps to use permutation notation because one may have trouble 
following what is going on when moving a triangle around in one’s head. Moreover, one 
can easily come up with two different permutations for many elements of D3. We will 
discuss the confusion in more detail in Section 3.7. Anyway, once you have translated the 
motions to permutations, the ambiguity is mostly gone - as long as you write permutations 
as functions in the usual way with the variable on the right, that is, $f(x)$, not $xf$. This forces
you to compose functions in the usual way, which also forces you to multiply the group 
elements in the corresponding way. 

To alleviate the confusion, one really needs two triangles - a fixed triangle of buckets
and a moving triangle of balls. Then we can use the permutation notation. A permutation
in $S_3$ is simply a 1-1, onto function $f$ from $\{1, 2, 3\}$ to $\{1, 2, 3\}$:


$$f = \begin{pmatrix} 1 & 2 & 3 \\ f(1) & f(2) & f(3) \end{pmatrix}.$$


On the top we have the fixed buckets. On the bottom we have the moving balls. See Fraleigh 
[32, p. 36]. If you Google “permutation group action confusion,” you will find many pages 
of attempts to clarify the situation. You might argue that instead of Figure 2.1, we should 
have drawn a figure with two triangles. We argue that the fixed triangle is easily imagined 
to be lying comatose underneath the figure. 

Recall that we have identified the position of bucket 1 with the position of the blue ball 
in the top left triangle of Figure 2.1, the position of bucket 2 with the pink ball position, and 
bucket 3 with the yellow ball. Then we see that the flip F does not move the ball in bucket 
3 and interchanges the balls in the other 2 buckets. Therefore F is given by the following 
permutation 


$$F = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}.$$


The rotation R moves the blue ball at bucket 1 to bucket 2 and the pink ball at bucket 2 to 
bucket 3, finally the yellow ball at bucket 3 to bucket 1. Thus R is given by 


$$R = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}.$$


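So what is $R \cdot F$? Instead of looking at the figure, just compose the functions - first $F$,
then $R$:


$$R \cdot F = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}.$$


This acts on a number in $\{1, 2, 3\}$ on the right so that first 1 goes to 2 and then 2 goes to
3 so that 1 goes to 3, for example. So we see that:


$$R \cdot F = F \cdot R^2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}.$$


What is $R^2$?


$$R^2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}.$$


What is $R^3$? That answer had better be the identity, $I$, which does not move any balls.


$$R^3 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} = I = \text{identity}.$$


We can create a multiplication table for the group once we have all the elements. The
last is $R^2 \cdot F = F \cdot R$. Thus $D_3$ is not commutative. We see that


$$FR = R^2 \cdot F = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}.$$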
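
The products of $R$ and $F$ computed above can be double-checked by letting a computer compose the functions. In this sketch a permutation is stored as a Python dict sending each number to its image; the helper name compose is ours, and it follows the convention of the text, namely that $p \cdot q$ means first $q$, then $p$.

```python
def compose(p, q):
    """Return p.q, meaning 'first q, then p', for permutations stored as dicts."""
    return {x: p[q[x]] for x in q}

I = {1: 1, 2: 2, 3: 3}
F = {1: 2, 2: 1, 3: 3}   # flip: swaps 1 and 2, fixes 3
R = {1: 2, 2: 3, 3: 1}   # rotation: 1 -> 2 -> 3 -> 1

R2 = compose(R, R)
print(compose(R, F))                       # {1: 3, 2: 2, 3: 1}
print(compose(R, F) == compose(F, R2))     # True:  RF = FR^2
print(compose(F, R) == compose(R2, F))     # True:  FR = R^2 F
print(compose(R, F) == compose(F, R))      # False: D_3 is not commutative
print(compose(R, R2) == I)                 # True:  R^3 = I
```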




Part | Groups 


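With the above computations, we can create the multiplication table for $D_3$, which is
incomplete until you do Exercise 2.1.3.


Multiplication table for $D_3$


        | I      R      R^2   | F      RF     R^2F
--------+---------------------+---------------------
I       | I      R      R^2   | F      RF     R^2F
R       | R      R^2    I     | RF     R^2F   F
R^2     | R^2    I      R     | R^2F   F      RF
--------+---------------------+---------------------
F       | F      R^2F   RF    |
RF      | RF                  |
R^2F    | R^2F                |


▲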


Exercise 2.1.1 Prove that $D_4$, the group of symmetries of a square, is not the same as $S_4$,
the group of permutations of four objects.


Hint. Note that $S_4$ has $4 \cdot 3 \cdot 2 = 24$ elements while $D_4$ has only eight elements. The group
$D_4$ contains a 90° counterclockwise rotation $R$ and two kinds of reflections: reflection $F_1$
across a diagonal and reflection $F_2$ across an axis connecting the midpoints of opposite
sides.


Definition 2.1.2 For $n \in \mathbb{Z}^+$, the symmetric group $S_n$ is the group of 1-1, onto functions
from $\{1, 2, \ldots, n\}$ onto $\{1, 2, \ldots, n\}$, with the operation composition of functions. The
elements of $S_n$ are called permutations of the integers $1, 2, \ldots, n$.


Definition 2.1.3 For $n \in \mathbb{Z}^+$, $n \ge 3$, the dihedral group $D_n$ is the group of rigid motions
of a regular $n$-gon. The group operation is composition.


Neither $S_n$ nor $D_n$ is a commutative group when $n \ge 3$.


Exercise 2.1.2 Show that $D_n$ and $S_n$ from Definitions 2.1.2 and 2.1.3 cannot have the same
number of elements for $n \ge 4$.


Hint. Note that if you have two adjacent vertices of the regular $n$-gon, say 1 and 2, then
any element $\sigma \in D_n$ maps them to adjacent vertices $\sigma(1)$ and $\sigma(2)$.


Exercise 2.1.3 Finish the multiplication table for D3. What is the inverse of R? What is the 
inverse of RF? Explain why D3 is a group. 


If you are given a multiplication table and asked to check that the set forms a group,
it is easy to check properties 2 and 3 in the definition of a group, but hard to check the
associative law (property 1). For a group with six elements, there are $6^3 = 216$ equations $a(bc) =
(ab)c$ that must be checked. However, in our case this is automatic, since our group elements
are functions mapping the set $\{1, 2, 3\}$ 1-1 onto itself and composition of functions is
associative. See Exercise 1.8.2. Multiplication tables of groups always have every element of
the group exactly once in each row and column, as we shall see in Corollary 2.3.1. See also Exercise 2.2.4.
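
Although the abstract argument makes the check unnecessary, a computer has no difficulty running through all the triples in a group of six elements. The sketch below does so for the six permutations of three objects and also tests the row and column property of the table without printing the table itself. The encoding (tuples on 0, 1, 2 instead of 1, 2, 3) and the helper name mul are our own assumptions.

```python
from itertools import permutations, product

# S_3 as tuples: p[i] is the image of i, with the objects renamed 0, 1, 2.
G = list(permutations(range(3)))

def mul(p, q):
    """First q, then p."""
    return tuple(p[q[i]] for i in range(3))

triples = list(product(G, repeat=3))
print(len(triples))                                                        # 216 equations
print(all(mul(a, mul(b, c)) == mul(mul(a, b), c) for a, b, c in triples))  # True

# Every element appears exactly once in each row and each column of the table.
rows_ok = all(sorted(mul(a, b) for b in G) == sorted(G) for a in G)
cols_ok = all(sorted(mul(a, b) for a in G) == sorted(G) for b in G)
print(rows_ok, cols_ok)                                                    # True True
```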

When considering Figure 2.1, you could also imagine that R moves 1 to 3, moves 2 to 
1, and moves 3 to 2. That would actually give you the inverse permutation to the one we 
wrote down above. This is the essence of the confusion. However, it does not really matter 
what you do - as long as you are consistent. Also, once you have the permutations, for R 
and F, you can stop looking at the triangle - always assuming that you are multiplying 
permutations correctly. See Section 3.7 for more information on permutations. 

Symmetry groups arise in many areas such as chemistry, physics, statistics - even biology. 
You can replace the two-dimensional figures we just considered by three-dimensional or 
even higher-dimensional figures and look at their rigid motions. 

How do you recognize the symmetry group of a figure in the plane? For the standard 
figures of classical plane geometry, it is the group of functions from the plane to itself that 
preserve the standard Euclidean distance and map the figure onto itself. There are three 
types of symmetries of figures in the plane: 


(1) reflection across a line, 
(2) rotation around the origin, 
(3) translation by a fixed vector $b$, $x \mapsto x + b$ for any vector $x \in \mathbb{R}^2$.


We have seen examples of figures with symmetries of types 1 and 2 for the dihedral 
group. To see examples of translational symmetries, look at Figure 2.2, which is supposed 
to stretch out to infinity to the left and to the right. 


Figure 2.2 Part of a design with translational symmetry which should be imagined to stretch out to
$\infty$ and $-\infty$


The rotations of a planar regular $n$-gon form a group with $n$ elements called $C_n$, the cyclic
group of order $n$. The group $C_n$ is:


$$C_n = \langle R \rangle = \{I, R, R^2, \ldots, R^{n-1}\}, \tag{2.1}$$


where $R$ is counterclockwise rotation by $2\pi/n$ radians. The group operation is again composition
of functions. $C_n$ has a particularly easy multiplication table. It is really the same
as that for the integers modulo $n$ under addition. Each row is obtained from the row above
by shifting to the left and moving the first element to the end.


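Multiplication table for $C_n$


          | I         R         R^2       ...   R^{n-2}   R^{n-1}
----------+---------------------------------------------------------
I         | I         R         R^2       ...   R^{n-2}   R^{n-1}
R         | R         R^2       R^3       ...   R^{n-1}   I
R^2       | R^2       R^3       R^4       ...   I         R
...       | ...
R^{n-2}   | R^{n-2}   R^{n-1}   I         ...   R^{n-4}   R^{n-3}
R^{n-1}   | R^{n-1}   I         R         ...   R^{n-3}   R^{n-2}


The upper left quarter of the multiplication table for $D_3$ is the same as the multiplication
table for $C_3$.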
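
The shifting pattern in the table for the cyclic group is easy to reproduce. In the sketch below we write the exponent $k$ in place of $R^k$, so that multiplying rotations becomes adding exponents mod $n$; the function name cyclic_table and the choice n = 5 are ours.

```python
def cyclic_table(n):
    """Table for C_n, writing the exponent k in place of R^k, so R^i R^j = R^((i+j) mod n)."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

table = cyclic_table(5)
for row in table:
    print(row)

# Each row is the row above shifted left, with its first entry moved to the end.
shifted = all(table[i] == table[i - 1][1:] + table[i - 1][:1] for i in range(1, 5))
print(shifted)   # True
```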

Example. The object in Figure 2.3 has rotational symmetry but not reflective symmetry.
Thus its symmetry group is $C_8$ and not $D_8$. See Gallian [33] for many more examples of
this sort. ▲


Figure 2.3 A figure with $C_8$ symmetry - not $D_8$
symmetry


Sometimes art is pleasing thanks to a mixture of symmetry and asymmetry. This is so 
with Figure 2.4 - a picture of a wall hanging from Raja Ampat in the Indonesian part of 
New Guinea. A good exercise would be to describe all the various symmetries and lack 
thereof in the figure. 


Figure 2.4 Art from the Raja Ampat islands in 
the Indonesian part of New Guinea 


Exercise 2.1.4 Find the symmetry groups of Figures 0.1, 0.2, and 0.3, which are in the 
preface. 


Exercise 2.1.5 Is the following table the multiplication table of a group $G = \{a, b, c, d\}$ of
order 4?


[Table for the operation $*$ on $\{a, b, c, d\}$; its entries are illegible in the source.]

Exercise 2.1.6 State which of the following are groups and why.


(a) The integers $\mathbb{Z}$ under addition
(b) $\mathbb{Z}$ under multiplication
(c) The real numbers $\mathbb{R}$ under addition
(d) $\mathbb{R}$ under multiplication
(e) $\mathbb{R} - \{0\}$ under multiplication
(f) $\mathbb{Z} - \{0\}$ under multiplication


The following three exercises give a few basic facts about groups. We will deduce more 
such things in Section 2.3. 


Exercise 2.1.7 Show that the identity element of a group is unique. Then show that, for
$a \in G$, the element $a^{-1}$ is unique.


Exercise 2.1.8 Show that, in any group $G$, if $a, b \in G$, then $(ab)^{-1} = b^{-1} a^{-1}$.


Exercise 2.1.9 Show that in a group $G$, if $a \in G$, then $(a^{-1})^{-1} = a$.


Exercise 2.1.10 Show that in a group $G$, if $a, b \in G$ and $(ab)^2 = a^2 b^2$, then $ab = ba$.


Exercise 2.1.11 Define the operation $\star$ on $\mathbb{Z}$ by $a \star b = a - b$. Does this operation make $\mathbb{Z}$
a group?


Exercise 2.1.12 State which of the following form groups: 


(a) the irrational numbers under multiplication; 

(b) the rational numbers Q under addition; 

(c) the rational numbers Q under multiplication; 

(d) the nonzero rationals $\mathbb{Q} - \{0\}$ under multiplication.


Exercise 2.1.13 Show that $S_n$ has $n!$ elements and $D_n$ has $2n$ elements.


Hint. To count the elements of $D_n$, consider a regular $n$-gon. Label its vertices $1, 2, \ldots, n$.
Define the rotation $R$ to be a counterclockwise rotation through an angle of $2\pi/n$. Let $F$ be a
flip about the axis through vertex 1 and the center of the regular $n$-gon. Any element $\sigma$ of
$D_n$ must fix the center of the $n$-gon and is determined by the image of two adjacent vertices
such as 1 and 2. Vertex 1 can be taken by $\sigma$ to any of the $n$ vertices - say $v$. Vertex 2 must
be taken to one of the two adjacent vertices to $v$. Then one can show that $\sigma = R^i$ or $FR^i$, for
some $i$. Thus


$$D_n = \langle R, F \rangle = \{I, R, R^2, \ldots, R^{n-1}, F, FR, FR^2, \ldots, FR^{n-1}\}.$$


In the preceding exercise, we use the notation $\langle R, F \rangle$ to denote the group of products of
powers of $R$ and $F$, which is called the group generated by $R$ and $F$. Similarly we wrote
$\langle R \rangle$ to denote the cyclic group generated by $R$ in equation (2.1). We will say more about
this idea when we discuss Cayley graphs of groups in the next section. It is also important
to know the relations between the elements of $\langle R, F \rangle$ such as $R^n = I = F^2 = FRFR$. This is a
different use of the word “relation” than that of Section 1.7.
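
One can let a computer generate the group from $R$ and $F$ and confirm both the count $2n$ and the relations. The following sketch models the rigid motions as permutations of the vertices; as an assumption made for convenience, the vertices are labeled $0, 1, \ldots, n-1$ rather than $1, \ldots, n$, so the flip fixes vertex 0. The names compose and dihedral are ours, and n = 7 is an arbitrary choice.

```python
def compose(p, q):
    """First q, then p; permutations of 0..n-1 stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def dihedral(n):
    """Generate the group <R, F> acting on vertices 0..n-1 of a regular n-gon."""
    identity = tuple(range(n))
    R = tuple((i + 1) % n for i in range(n))   # rotation by 2*pi/n
    F = tuple((-i) % n for i in range(n))      # flip fixing vertex 0
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in (R, F):                       # multiply on the right by each generator
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return identity, R, F, elements

n = 7
I, R, F, D = dihedral(n)
print(len(D) == 2 * n)                         # True

Rn = I
for _ in range(n):
    Rn = compose(Rn, R)
FRFR = compose(compose(F, R), compose(F, R))
print(Rn == I, compose(F, F) == I, FRFR == I)  # True True True
```

The search multiplies on the right by generators only; in a finite group this already reaches every product of powers of $R$ and $F$, inverses included.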

There is much more that can be said about symmetry. Felix Klein (1849-1925) said “a
geometry is a space with a group of transformations.” Richard Brauer said in 1963: “Groups 
are the mathematical concept with which we describe symmetry.” Hermann Weyl wrote an 
interesting book on the subject in which he gave an introduction to symmetry in physics 
and mathematics (see [125]). In a related shorter paper he said [126, pp. 592-610]: “By far 
the most fertile application of symmetry in the whole inorganic world has been made by 
quantum physics in studying the atomic molecular spectra.” He also noted: “Relativity is 
nothing else than the problem of determining the group of automorphisms of space itself.” 
Another reference for more general types of symmetry is the book by David Mumford, 
Caroline Series, and David Wright [80]. Non-Euclidean symmetries are used to produce 
some beautiful figures - including a few that go back to books of Klein and Fricke - figures 
produced by hand in the 1800s. The artist M. C. Escher created the beautiful circle limit 
designs using non-Euclidean symmetry. See Doris Schattschneider’s book on Escher [100]. 
In the next section we will investigate ways to produce visions of symmetry which are 
infinitely more pedestrian than Escher’s methods. 

To close this section, we display three figures - one with Euclidean symmetry, one with 
spherical symmetry, and the other with non-Euclidean symmetry. Such symmetries are 
discussed in our book [118]. Figure 2.5 is a density plot of a two-term Fourier series in two 
variables and was produced with the Mathematica command below. 


DensityPlot[Sin[4*Pi*x]*Sin[Pi*y]+Sqrt[2/3]*Sin[Pi*x]*Sin[4*Pi*y],
{x,-4,4},{y,-4,4},ColorFunction->Hue,PlotPoints->500,Frame->False]


Figure 2.5 Wallpaper from a Fourier series in two 
variables 


Figure 2.6 is a density plot of a spherical analog produced with the Mathematica command
below.


ParametricPlot3D[{Cos[ϕ]*Sin[θ],Sin[ϕ]*Sin[θ],Cos[θ]},{ϕ,0,2π},{θ,0,π},
PlotPoints->300,Mesh->None,
ColorFunction->Function[{x,y,z,ϕ,θ},Hue[Re[SphericalHarmonicY[14,7,θ,ϕ]]]],
ColorFunctionScaling->False,Axes->False,Boxed->False]


Figure 2.7 is a density plot of a non-Euclidean analog and was produced with the 
Mathematica command below. 


DensityPlot[Abs[(y^6)*DedekindEta[x+I*y]^24],{x,-2.5,2.5},{y,0,1},
ColorFunction->Hue,PlotPoints->500,AspectRatio->1/2,Frame->False]


Figure 2.6 Spherical wallpaper from spherical
harmonics


Figure 2.7 Hyperbolic wallpaper from a modular form known as $\Delta$ on the upper half plane - a
function with an invariance property under fractional linear transformation $(az+b)/(cz+d)$,
where $a, b, c, d \in \mathbb{Z}$ and $ad - bc = 1$


For Figure 2.5 the symmetry group is $\mathbb{Z}^2$ consisting of vectors $(a, b)$ such that $a, b \in \mathbb{Z}$
with the operation vector addition. For Figure 2.6, the symmetry group is the group of rotations
of the sphere with the operation composition. For Figure 2.7 the symmetry group is
the modular group $SL(2, \mathbb{Z})$ of $2 \times 2$ matrices with integer entries and determinant 1 with
the operation matrix multiplication. The functions being computed with the Mathematica
commands above are those involved with Fourier analysis for the three groups. Such analysis
can be applied to physical problems having symmetry described by these groups. In
the plane, one might consider the vibration of a rectangular plate. That would involve
Fourier series in two variables. In another example, earthquakes cause the earth to vibrate
and such motions can be described using spherical harmonics. Sometimes the symmetry
requires some tortuous reasoning. Consider number theory problems such as Fermat’s last
theorem - which is surprisingly related to the modular group. A book that uses the same sort
of methods that we used to create Figures 2.5-2.7 in order to create symmetric wallpaper 
is that of Farris [30]. 


2.2 Visualizing Groups


We have multiplication tables which help us to visualize groups. Using the program Group
Explorer, for example, we get Figure 2.8 for the multiplication table of a cyclic group $G = \langle a \rangle$
of order 6 and Figure 2.9 for the multiplication table of the dihedral group $D_3$ (alias $S_3$).


Figure 2.8 Group Explorer version of the multiplication table
for $C_6$, a cyclic group of order 6


Figure 2.9 Group Explorer version of the multiplication table
for $D_3$ (alias $S_3$) with our upper case R and F replaced by lower
case letters


But Cayley gave us another way to visualize these groups, using directed graphs (which
are just sets of vertices with directed edges (arrows) connecting them). In fact, he even
colored the edges. Every element of the cyclic group $\langle a \rangle$ is a power of $a$. We say that $a$ is
a generator of $\langle a \rangle$. Every element of $D_3$ is a finite product of powers of $R$ and $F$, where
$R$ is a rotation by 120° counterclockwise and $F$ is a flip about a vertical axis through the
top vertex and the midpoint of the bottom edge. We say that $\{R, F\}$ generates $D_3$ and write
$D_3 = \langle R, F \rangle$. In general, a subset $S$ of a group $G$ is said to be a set of generators of $G$ iff
all elements of $G$ are finite products of elements of $S$. Associate a color to each element of
$S$. The Cayley graph $X(G, S)$ has vertices corresponding (in 1-1 fashion) to the elements
of the group $G$. Then for each element $s \in S$ and each vertex $g \in G$ draw an arrow with the
color corresponding to $s$ from vertex $g$ to vertex $gs$. We get a directed graph with colored
edges. Figures 2.10 and 2.11 show Cayley graphs for $G = C_6 = \langle a \rangle$ with $S = \{a\}$ and for $D_3$
with $S = \{R, F\}$. However, we have given the edges the same color in Figure 2.11. Because
$F$ has order 2, the edges corresponding to $F$ are undirected.

We could also create Cayley graphs by drawing edges connecting vertex $g$ to vertex $sg$,
for $s \in S$. For commutative groups this would not matter but otherwise it could create a
different graph - yet another right-left problem. Moreover there are many ways to draw a
graph with more than two vertices.

Of course you can take different generating sets for the same group. In Figures 2.12 and
2.13 we choose $G = C_6 = \langle a \rangle$, with $S = \{a, a^{-1}\}$ and $S = \{a, a^3, a^5\}$, respectively. Instead of
putting arrows on the edges, since both directions are present on each edge of these two
graphs, we just draw undirected graphs - that is, graphs with no arrows on the edges. Again,
we have used the same color for the edges.


Figure 2.10 Cayley graph of cyclic group $G = \langle a \rangle$ of order 6
with generating set $S = \{a\}$


Figure 2.11 Cayley graph of $D_3$ with generating set $\{R, F\}$.
Our upper case letters are replaced by lower case in the
diagram. If there are arrows in both directions on an edge,
we omit the arrows


Figure 2.12 Undirected version of Cayley graph for $G = \langle a \rangle$,
generating set $S = \{a, a^{-1}\}$


Figure 2.13 Cayley graph of $C_6 = \langle a \rangle$, generating set
$S = \{a, a^3, a^5\}$

The fact that $S$ is a generating set for the group $G$ manifests itself in the Cayley graph
$X(G, S)$ by ensuring that any two vertices in the graph have a path connecting them
consisting of a sequence of directed edges. In this situation, we say that the graph is
connected.
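
The link between generating sets and connectedness can be tested on a small example. In the sketch below the cyclic group of order 6 is modeled by the integers mod 6 under addition, with the exponent $k$ standing for $a^k$. The function names cayley_graph and is_connected are our own, and the search starts from the identity, which suffices for a finite group.

```python
from collections import deque

def cayley_graph(elements, op, generators):
    """Directed edges g -> g*s for every vertex g and every generator s."""
    return {g: [op(g, s) for s in generators] for g in elements}

def is_connected(graph, start):
    """Breadth-first search along directed edges from start."""
    seen, queue = {start}, deque([start])
    while queue:
        g = queue.popleft()
        for h in graph[g]:
            if h not in seen:
                seen.add(h)
                queue.append(h)
    return len(seen) == len(graph)

def add6(a, b):
    return (a + b) % 6

Z6 = range(6)   # a model of C_6 = <a>, with the exponent k standing for a^k
print(is_connected(cayley_graph(Z6, add6, [1]), 0))      # True: {a} generates
print(is_connected(cayley_graph(Z6, add6, [2]), 0))      # False: a^2 alone does not
print(is_connected(cayley_graph(Z6, add6, [2, 3]), 0))   # True: a = a^3 (a^2)^(-1)
```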


Exercise 2.2.1 Define the Klein 4-group as the group $\mathbb{Z}_2^2$, consisting of vectors $(a, b)$ with
$a, b \in \mathbb{Z}_2$, where the group operation is componentwise addition mod 2. Draw a Cayley graph
for the Klein 4-group with generating set $S = \{h, v\}$, $h = (1, 0)$, $v = (0, 1)$. The multiplication
table is in Figure 2.14. Does this Cayley graph look like that for $C_4 = \langle a \rangle$ with $S = \{a, a^{-1}\}$?
What does that tell us about using Cayley graphs to understand groups?


Hint. Relations $a^4 = e$ in $C_4$ and $2h = 2v = (0, 0)$ in $\mathbb{Z}_2^2$ correspond to closed paths in the
graphs.


Figure 2.14 Group Explorer version of the multiplication table for the 
Klein 4-group 




Exercise 2.2.2 Find the symmetry group of a rectangle which is not a square. 


Exercise 2.2.3 Find the symmetry group for each of the designs in Figure 2.15. 


Figure 2.15 Symmetrical designs 


Exercise 2.2.4 Prove that every multiplication table for a finite group is a Latin square - 
meaning that every row contains every element exactly once, the same for the columns. 


Exercise 2.2.5 Draw the Cayley graph for $D_4$ with the generating set $\{R, F\}$, where $R$ is a
counterclockwise rotation of the square through 90° and $F$ is a flip about an axis through
vertex 1 and the center of the square.


Exercise 2.2.6 Consider the affine group


$$\mathrm{Aff}(3) = \left\{ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \;\middle|\; a, b \in \mathbb{Z}_3,\ a \ne 0 \right\}.$$


Show that $\mathrm{Aff}(3)$ is a group under matrix multiplication as defined in formula (1.3) of
Section 1.8 and the statement that follows it. Draw a Cayley graph for this group with
generating set $S = \left\{ \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \right\}$. Compare with the Cayley graph $X(D_3, \{R, F\})$. Are
these groups really the same in some sense (a sense to be known to us as isomorphic groups
in Section 3.2)?


We want to restrict ourselves mainly to undirected graphs here. This means that there
are no arrows on the edges. A Cayley graph $X(G, S)$ is undirected if the edge set $S$ is
symmetric, meaning that $s \in S$ implies $s^{-1} \in S$. Associated to an undirected graph with $n$
vertices $V = \{v_1, \ldots, v_n\}$ is a symmetric $n \times n$ matrix of 0s and 1s called the adjacency
matrix $A$. The $i,j$ entry is 1 if there is an edge connecting vertex $v_i$ with vertex $v_j$, and 0 otherwise. The
matrix $A$ is symmetric because the $i,j$ entry is the same as the $j,i$ entry - which is true
because the edge set is symmetric. The properties of the adjacency matrix are of great
interest in graph theory, computer science, and chemistry. If the graph is a Cayley graph, 
the group has much to say about the adjacency matrix. We will say more about this in 
Section 4.2. 


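Example. Consider an adjacency matrix for the Cayley graph of $\mathbb{Z}_5$ under addition (mod 5)
with generating set $\{1, -1 \pmod 5\}$; that is, $X(\mathbb{Z}_5, \{1, -1 \pmod 5\})$. We list the vertices
in the usual order $\{0, 1, 2, 3, 4\}$ to obtain the matrix


$$\begin{pmatrix} 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 \end{pmatrix}.$$

▲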
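
The adjacency matrix in this example follows mechanically from the definition: vertex $i$ is joined to vertex $j$ exactly when $j - i$ lies in the generating set mod $n$. A minimal Python sketch, with the function name cayley_adjacency being our own:

```python
def cayley_adjacency(n, S):
    """Adjacency matrix of X(Z_n, S) with vertices listed as 0, 1, ..., n-1."""
    S = {s % n for s in S}
    return [[1 if (j - i) % n in S else 0 for j in range(n)] for i in range(n)]

A = cayley_adjacency(5, [1, -1])
for row in A:
    print(row)

# Symmetric because the generating set {1, -1} is symmetric.
print(all(A[i][j] == A[j][i] for i in range(5) for j in range(5)))   # True
```

Each row is a cyclic shift of the one above it, just as in the multiplication table of a cyclic group.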

The Cayley graph also tells us about relations in the group. This idea is not to be confused
with the relations in Section 1.7. The relations we discuss here are equations involving
“words” made up from the generators of our group and their inverses. For example, look
at Figure 2.11. Follow the path going around the outside triangle clockwise starting at the
top vertex and returning to it. This corresponds to the equation $RRR = R^3 = I$, which is a
relation in $D_3$. Remember that, although we use capital letters for the group elements in the
text, our figures have lower case letters. On the right side of Figure 2.11, the path which
starts at $e$ having four edges gives a second relation $RFRF = I$ in $D_3$. One way to define a
group is to give its generators plus a set of relations. Cayley graphs do this in pictures. A
presentation of a group $G$ is a pair $\langle S : R \rangle$, where $S$ is a set of generators of $G$ and $R$ is a
set of words in these generators representing relations saying that the word = the identity
and such that the relations in $R$ generate all relations involving words in elements of $G$. We
will say more about presentations of groups via generators and relations later. For example,
a presentation of the dihedral group $D_n$ is given by $D_n = \langle R, F : R^n = F^2 = RFRF = I \rangle$. Group
Explorer lists groups this way. In one sense a group presentation tells you everything about
the group. For a large group, however, not so much - just as a large Cayley graph is pretty
hard to use to develop an understanding of the group.

The free group on a set S of generators means the set of all possible words in the genera- 
tors and their inverses, with no relations. If S has one element, this group can be identified 
with Z, but otherwise it is an infinite non-Abelian group. 


Exercise 2.2.7 Consider the set SO(2) consisting of matrices

$$m(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \quad \text{for } \theta \in \mathbb{R}.$$

Show that SO(2) is a group under matrix multiplication as defined in formula (1.3) of
Section 1.8 and the statement that follows it. This is called the special orthogonal group.
What is the effect of the group element $m(\theta)$ upon the vector $v = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ when we multiply
$m(\theta)v$? Is the group SO(2) commutative?


Hint. Identify vectors ${}^t(x, y)$ in the plane with complex numbers $x + iy$, $i = \sqrt{-1}$. Here
${}^t(x, y)$ denotes the transpose of $(x, y)$: that is, the corresponding column vector. Then
multiplication by $m(\theta)$ corresponds to multiplication of $x + iy$ with $e^{i\theta} = \cos\theta + i\sin\theta$.


Exercise 2.2.8 Suppose that G is a group with identity element e. Show that if $g^2 = g \cdot g = e$
for all $g \in G$, then G is Abelian.


Exercise 2.2.9 Write down an adjacency matrix for the Cayley graph of $\mathbb{Z}_6$ under addition
(mod 6) with generating set {1, −1 (mod 6)}; that is, $X(\mathbb{Z}_6, \{1, -1 \pmod 6\})$. List the
vertices in the usual order {0, 1, 2, 3, 4, 5}.


Exercise 2.2.10 Write down an adjacency matrix for the Cayley graph $X(D_3, \{F, R, R^{-1}\})$,
using the notation of the previous section and listing the vertices in the order
$\{I, R, R^2, F, FR, FR^2\}$. Compare this result with that of the preceding exercise.


2.3 More Examples of Groups and Some Basic Facts 


For some more examples of groups, consider the symmetry groups of the five regular 
polyhedra (the Platonic solids): tetrahedron, cube, octahedron, dodecahedron, icosahedron; 
these are shown in Figure 2.16 - drawn by Mathematica. For more information on them, 
see Wikipedia under Platonic solids or groups or 


http://www.dartmouth.edu/~matc/math5.geometry/unit6/unit6.html#Elements 
or 
http://www-history.mcs.st-and.ac.uk/~john/geometry/Lectures/L10.html. 


The symmetry groups of the Platonic solids are quite interesting finite groups called the
tetrahedral group ($A_4$) of proper symmetries of a tetrahedron, the octahedral group ($S_4$) of
proper symmetries of an octahedron or cube, and the icosahedral group ($A_5$) of proper
symmetries of an icosahedron or dodecahedron. Here “proper” means that the symmetry is
obtained by an orientation-preserving rotation that one can actually perform in 3-space.
They correspond to a 3 × 3 real matrix g in the special orthogonal group $SO(3, \mathbb{R})$, meaning
that ${}^t g g = I$ (the identity matrix, which is a diagonal matrix with 1s on the diagonal)
and with det g = 1. Here ${}^t g$ denotes the transpose of the matrix g, which means the matrix
whose rows are the corresponding columns of g. These groups can all be identified with
permutation groups. We will define the alternating group $A_n$ in Section 3.1. It is a subgroup of
the symmetric group $S_n$ which consists of half the elements of $S_n$ (the “even” permutations).


Figure 2.16 The Platonic solids (dodecahedron, icosahedron, octahedron, tetrahedron, cube)

We should note here that many authors throw in the improper rotations which cannot
actually be performed in 3-space - just as the flips in $D_n$ cannot be performed in the
plane. The group corresponding to all proper and improper rotations of 3-space is the
orthogonal group $O(3, \mathbb{R})$ of 3 × 3 real matrices g such that ${}^t g g = I$. Allowing improper
rotations produces groups of symmetries that are twice as large as those we spoke of above.
Thus in Wikipedia, for example, if you read the entry on the tetrahedral group you will find
that they are talking about $S_4$ rather than $A_4$. Moreover they will use the chemist’s notation
$T_d$ instead of $S_4$.

The tetrahedral group is important to chemists because it is the symmetry group of various
molecules such as methane $\mathrm{CH}_4$. This was discovered in 1913 by W. H. Bragg and his
son when they found that diamonds have tetrahedral symmetry. A diamond is a crystal
of carbon atoms which are tetrahedrally bonded. The Braggs were pioneers in X-ray
crystallography. You might ask why the molecule benzene ($\mathrm{C}_6\mathrm{H}_6$) - which is seemingly more
complicated than methane - has a simpler symmetry group, the cyclic group of order 6. This
was proved by Kathleen Lonsdale in 1929. She was a student of the senior Bragg and the 
first woman tenured at the University College London. Such things were proved using X-ray 
diffraction. This subject will be discussed a bit more in Section 4.2 but I fear that we will not 
really answer the question without doing a study of X-ray diffraction spectroscopy. This 
would require us to know more about representations of groups and quantum mechanics 
than is feasible for a short book. 

Chemistry is not the only field that makes use of the groups of the Platonic solids. For 
example, the dodecahedral group is the symmetry group of many viruses - viruses such 
as those causing polio and herpes. There is hope that understanding the symmetries of 
viruses will help to find vaccines. Many papers on the subject have been written by Reidun
Twarock at the Departments of Mathematics and Biology, York Centre for Complex Systems 
Analysis, University of York. Some beautiful pictures of viruses can be found in her papers 
and in Wikipedia articles on the subject. Or just Google images of icosahedral viruses. 

As we said in Definition 2.1.2, the symmetric group $S_n$ is the group of 1-1, onto functions

$$f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$$

with group operation given by composition of functions. We used the notation

$$\begin{pmatrix} 1 & 2 & \cdots & n \\ f(1) & f(2) & \cdots & f(n) \end{pmatrix}.$$

This allows us to see that $S_n$ has n! elements. For there are n ways to choose f(1). Then,
since f is 1-1, there are n − 1 ways to choose f(2), and n − 2 ways to choose f(3) as
f(3) cannot equal f(1) or f(2). Keep going in this way (inductively). At the end there is
only one way to choose f(n). Therefore we find that the number of elements in $S_n$ is
$n! = n(n-1)(n-2) \cdots 2 \cdot 1$. Thus $S_4$ has 4 · 3 · 2 · 1 = 24 elements. The alternating group $A_n$
consists of “even” permutations and always has half the number of elements of $S_n$, as we
shall see in Section 3.1. Thus $A_4$ has 12 elements, and $A_5$ has 60 elements. We will study
permutation groups in detail in Section 3.1 and in later sections. Let us return, for the rest
of this section, to some more familiar groups.
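The counts above are easy to confirm by brute force for small n. The sketch below is only an illustration: the helper names are invented here, and since "even" permutations are not defined until Section 3.1, it assumes the standard criterion that a permutation is even when it has an even number of inversions.

```python
from itertools import permutations


def is_even(perm):
    """Assumed criterion: even means an even number of inversions,
    i.e. pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return inversions % 2 == 0


def count_sn_an(n):
    """Return (|S_n|, |A_n|) by listing every permutation of 1..n."""
    perms = list(permutations(range(1, n + 1)))
    return len(perms), sum(1 for p in perms if is_even(p))


for n in (3, 4, 5):
    print(n, count_sn_an(n))
```

Listing all of $S_n$ is only practical for small n, since n! grows very quickly.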


Exercise 2.3.1 Describe the elements of the group of proper symmetries of a regular tetrahe- 
dron as a subgroup of S4. Show that — besides the identity - there are two sorts of rotations - 
one fixing only one vertex and the other fixing no vertex. We really need results from Sections 
3.1 and 3.7 to give the whole story. 


The groups of symmetries of the Platonic solids are fairly complicated. Let us consider 
some simpler groups instead. 


Example 1: The Group $\mathbb{Z}_n$ under Addition mod n. The elements of $\mathbb{Z}_n$ are really equivalence
classes for the equivalence relation of congruence modulo n. Here we define the sum of
equivalence classes by summing representative elements of the classes. That is, setting
$[a] = \{b \mid b \equiv a \pmod n\}$, we have:

$$[a] + [b] = [a + b].$$

Or one could just say $\mathbb{Z}_n = \{[0], [1], \ldots, [n-1]\}$ and the operation is addition mod n.
One should really check that the operation is well defined. That means we should show that
$a \equiv a' \pmod n$ and $b \equiv b' \pmod n$ implies $a + b \equiv a' + b' \pmod n$. This was Exercise 1.6.5.
That the operation of addition mod n turns $\mathbb{Z}_n$ into a group requires us to find an identity,
which is 0 (mod n), and the inverse of a (mod n), which is −a (mod n). The associative law
follows from that for $\mathbb{Z}$.

Compute the Cayley addition table for $\mathbb{Z}_6$. Each row of the table is obtained by shifting
the row above it to the left (extending the row above by adding a copy of it to the right).
This is quickly seen to be a colorless version of the Cayley table in Figure 2.8.

Table for $\mathbb{Z}_6$ under addition

+ mod 6 | 0  1  2  3  4  5
--------+-----------------
   0    | 0  1  2  3  4  5
   1    | 1  2  3  4  5  0
   2    | 2  3  4  5  0  1
   3    | 3  4  5  0  1  2
   4    | 4  5  0  1  2  3
   5    | 5  0  1  2  3  4

The diagonals going from lower left to upper right are constant. This is a cyclic group
generated by the element [1], meaning that

$$\forall\, [a] \in \mathbb{Z}_6,\ \exists\, n \in \mathbb{Z}^+ \text{ s.t. } [a] = n[1] = \underbrace{[1] + \cdots + [1]}_{n \text{ times}}.$$

These are the easiest groups to deal with. We study such groups in Section 2.5. ▲


Example 2: The Unit Group $\mathbb{Z}_n^*$. What can we do about multiplication mod n? We make the
analogous definition to that for addition. Thus, we multiply in $\mathbb{Z}_n$ by writing

$$[a][b] = [ab].$$

This operation is also well defined: that is, [a] = [a′] and [b] = [b′] imply [ab] = [a′b′]. This
was proved in Exercise 1.6.6.

We need to take a subset of $\mathbb{Z}_n$ to get a group under multiplication. This subset is the
unit group.


Definition 2.3.1 The group of units mod n is defined by

$$\mathbb{Z}_n^* = \{[a] = a \pmod n \mid \gcd(a, n) = 1\},$$

with the operation [a][b] = [ab].


We call the elements of $\mathbb{Z}_n^*$ “units” because they are the invertible elements for
multiplication in $\mathbb{Z}_n$. Thus $\mathbb{Z}_n^*$ is the set of [a] = a (mod n) such that there exists b with
$ab \equiv 1 \pmod n$. To prove this, note that $a \in \mathbb{Z}_n^*$ implies that

$$1 = \gcd(a, n) = ab + mn, \quad \text{for some integers } b, m. \tag{2.3}$$

Here we use Theorem 1.5.2 which says that d = gcd(a, n) is the smallest positive integer d
of the form d = ab + mn, for integers b, m. But equation (2.3) implies that $ab \equiv 1 \pmod n$.
Thus a is invertible for multiplication (mod n).

Conversely if $ab \equiv 1 \pmod n$, then ab − 1 = cn, for some integer c, and then 1 = ab − cn.
This implies that 1 = gcd(a, n) (again using Theorem 1.5.2).

Thus we can check that $\mathbb{Z}_n^*$ is a group under multiplication. To see that $\mathbb{Z}_n^*$ is closed
under multiplication, note that $[a], [b] \in \mathbb{Z}_n^*$ means that there are integers c and d such that
$ac \equiv 1 \pmod n$ and $bd \equiv 1 \pmod n$. This implies that $abcd \equiv 1 \pmod n$ and $[ab] \in \mathbb{Z}_n^*$.

The associative law follows from that for the integers; that is, (ab)c = a(bc) for integers
a, b, c implies $(ab)c \equiv a(bc) \pmod n$. The identity is 1 (mod n) using the fact that
gcd(1, n) = 1.

Inverses exist in $\mathbb{Z}_n^*$ as we proved above. For example: to find the inverse of 2 (mod 5)
you need to find b such that $2b \equiv 1 \pmod 5$. The solution is $b \equiv 3 \pmod 5$. You can find it
by trial and error. Try all values of b (mod 5) and see which one works.

Another way to find b is to write gcd(2, 5) = 1 = 2b + 5c, using the Euclidean
algorithm. Note that 1/2 does not make sense in $\mathbb{Z}_5^*$ except as a way to write $2^{-1}$, which is 3 (mod 5).
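The Euclidean-algorithm route can be sketched in Python as follows. The function names are invented for this illustration; the recursion is the standard extended Euclidean algorithm, which returns integers x, y with gcd(a, b) = ax + by as in Theorem 1.5.2.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) = a*x + b*y."""
    if b == 0:
        return a, 1, 0
    d, x, y = extended_gcd(b, a % b)
    # d = b*x + (a % b)*y = a*y + b*(x - (a // b)*y)
    return d, y, x - (a // b) * y


def inverse_mod(a, n):
    """Inverse of a in the unit group mod n; fails if gcd(a, n) != 1."""
    d, x, _ = extended_gcd(a % n, n)
    if d != 1:
        raise ValueError(f"{a} is not a unit mod {n}")
    return x % n


print(inverse_mod(2, 5))  # the text finds b = 3 (mod 5)
```

For 2 (mod 5) the algorithm produces 1 = 2·(−2) + 5·1, and −2 ≡ 3 (mod 5) agrees with the value found by trial and error.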

Look at the case n = 6. $\mathbb{Z}_6^* = \{1, -1 \pmod 6\}$. The multiplication table for $\mathbb{Z}_6^*$ below is
essentially that of a cyclic group of order 2.


· (mod 6) |  1 | −1
    1     |  1 | −1
   −1     | −1 |  1


Exercise 2.3.2 Show that there is no group G of integers mod n with the operation
multiplication mod n such that $\mathbb{Z}_n^* \subsetneq G \subset \mathbb{Z}_n$ with the unit group $\mathbb{Z}_n^*$ defined by (2.2).


How large is the unit group $\mathbb{Z}_n^*$? The answer depends on n. For example, if n = 5, we
see that $\mathbb{Z}_5^*$ has all four nonzero elements of $\mathbb{Z}_5$. But $\mathbb{Z}_6^*$ has only the two elements, 1 and
5 (mod 6). The number of elements of $\mathbb{Z}_n^*$ is defined to be the Euler phi-function $\phi(n)$. It is
easy to check that

$$\phi(6) = 2, \quad \phi(8) = 4, \quad \text{and} \quad \phi(12) = 4.$$


Exercise 2.3.3
(a) Prove the last three results by listing all the elements of $\mathbb{Z}_n^*$, for n = 6, 8, and 12.
(b) Show that $\phi(p^e) = p^e - p^{e-1}$, for any prime p, and exponent e = 1, 2, 3, ....


Hint for part (b): count the multiples of p between 1 and $p^e$.


To compute $\phi(n)$ in general, you need to factor n into a product of pairwise distinct prime
powers. Then for pairwise distinct primes $p_i$ it can be proved (see Exercise 2.3.12) that

$$\phi(p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}) = \phi(p_1^{e_1})\, \phi(p_2^{e_2}) \cdots \phi(p_r^{e_r}) = (p_1^{e_1} - p_1^{e_1 - 1})(p_2^{e_2} - p_2^{e_2 - 1}) \cdots (p_r^{e_r} - p_r^{e_r - 1}). \tag{2.4}$$

This fact really comes from the Chinese remainder theorem (see Section 6.2).
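As a sanity check rather than a proof, formula (2.4) can be compared with a direct count of the units mod n. The sketch below uses trial division to factor n, which is fine for small n; the function names are invented for the illustration.

```python
from math import gcd


def phi_by_counting(n):
    """Count the a in {1, ..., n} with gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def factor(n):
    """Prime factorization of n as a dict {prime: exponent}, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def phi_by_formula(n):
    """Evaluate the right-hand side of (2.4)."""
    result = 1
    for p, e in factor(n).items():
        result *= p**e - p**(e - 1)
    return result


for n in (6, 8, 12):
    print(n, phi_by_counting(n), phi_by_formula(n))
```

The two functions agree on every n one cares to try, which is what Exercises 2.3.10 to 2.3.12 explain.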


The multiplication table for $\mathbb{Z}_8^* = \{1, 3, -3, -1 \pmod 8\}$ is a bit more interesting than
that for $\mathbb{Z}_6^*$.


· (mod 8) |  1 |  3 | −1 | −3
    1     |  1 |  3 | −1 | −3
    3     |  3 |  1 | −3 | −1
   −1     | −1 | −3 |  1 |  3
   −3     | −3 | −1 |  3 |  1


With a relabeling of entries this is the same multiplication table as that of the Klein
4-group. See Exercise 2.2.1 and Figure 2.14. Felix Klein named it the 4-group in 1884.
Note that every element a of this group has the property that a · a = identity. Also the
group is commutative.
Next let us consider a group of 2 × 2 matrices under matrix multiplication.


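Example 3: The General Linear Group $GL(2, \mathbb{Z}_p)$, where p is a prime. You could also replace
$\mathbb{Z}_p$ with $\mathbb{R}$, the real numbers. The general linear group is defined by

$$GL(2, \mathbb{Z}_p) = \left\{ g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \,\middle|\, a, b, c, d \in \mathbb{Z}_p,\ \det g = ad - bc \in \mathbb{Z}_p^* \right\}, \tag{2.5}$$

where the operation is matrix multiplication

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x & y \\ u & v \end{pmatrix} = \begin{pmatrix} ax + bu & ay + bv \\ cx + du & cy + dv \end{pmatrix}.$$

The definition of matrix multiplication is the same as for matrices of real numbers in
formula (1.3) of Section 1.8, except that all computations are with integers modulo p. Note
that (2.5) requires that the determinant be in the unit group $\mathbb{Z}_p^*$.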

Why is $GL(2, \mathbb{Z}_p)$ closed under multiplication of matrices? You need to know that
det(AB) = det(A) det(B). Since we are assuming det(A) and det(B) are in the unit group
$\mathbb{Z}_p^*$, it follows that the product det(A) det(B) is also in $\mathbb{Z}_p^*$. The reason is that $\mathbb{Z}_p^*$ is a group
under multiplication - as we saw in the last example. We will have more to say about
determinants in Section 7.3.


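The identity in $GL(2, \mathbb{Z}_p)$ is the identity matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.

The inverse of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix},$$

where all computations are done modulo p.

Here the reciprocal of the determinant is computed mod p, just as we compute inverses in
$\mathbb{Z}_p^*$. We are not computing using fractions in the rational numbers. Thus $1/a = a^{-1} = b$ really
means b is an integer such that $ab \equiv 1 \pmod p$. This is what happens in $\mathbb{Z}_p^*$. Note that the
inverse is in $GL(2, \mathbb{Z}_p)$ since $\det(M^{-1}) = (\det(M))^{-1}$.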

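Matrix multiplication is associative because it comes from composition of functions.
We saw this in Exercise 1.8.19. What is the function? For $F = \mathbb{Z}_p$, with prime p, let $F^2$ be
the two-dimensional vector space over F:

$$F^2 = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \,\middle|\, x, y \in F \right\}.$$

Here we define the sum of vectors and product with scalar $a \in F$ by

$$\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x + u \\ y + v \end{pmatrix}, \qquad a \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax \\ ay \end{pmatrix}, \tag{2.6}$$

where all operations are mod p.

Given a matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, we have a linear function $T_M : F^2 \to F^2$ given by

$$T_M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}. \tag{2.7}$$

The $T_M$ function uniquely determines the matrix, since $M = \left( T_M \begin{pmatrix} 1 \\ 0 \end{pmatrix} \;\; T_M \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right)$.

Composition of these functions corresponds to multiplying matrices. That is, $T_{MN} = T_M \circ T_N$.
Thus matrix multiplication is associative. If you feel like filling up a page with computation,
you could also check that multiplication of matrices is associative by multiplying out three
matrices with parentheses arranged in two ways. See Section 7.1 for a discussion of vector
spaces.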

One can obtain an infinite general linear group $GL(2, \mathbb{R})$ by replacing $\mathbb{Z}_p$ with $\mathbb{R}$ and
$\mathbb{Z}_p^* = \mathbb{Z}_p - \{[0]\}$ with $\mathbb{R}^* = \mathbb{R} - \{0\}$. The elements of $GL(2, \mathbb{R})$ are called non-singular matrices in
linear algebra classes. A non-singular square matrix is a matrix with nonzero determinant
or equivalently a matrix with a multiplicative inverse. See Exercise 7.3.13. ▲


Exercise 2.3.4 For $R = \mathbb{Z}_n$, show that $R^2 = \{(x, y) \mid x, y \in R\}$ is a group under addition using
the componentwise definition of addition in (2.6).


Take p=5 and look at the following example. 


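Special Case. In $GL(2, \mathbb{Z}_5)$, we get the following results. All computations are mod 5. Thus
1/(−2) is 2 (mod 5), since $2(-2) = -4 \equiv 1 \pmod 5$.

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}^{-1} = \frac{1}{-2} \begin{pmatrix} 4 & -2 \\ -3 & 1 \end{pmatrix} = 2 \begin{pmatrix} 4 & -2 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix}.$$

To check this, we multiply out the two matrices to see if we get the identity matrix.

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix} = \begin{pmatrix} 3 + 8 & 1 + 4 \\ 9 + 16 & 3 + 8 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Again the arithmetic was all mod 5. Similarly one checks that

$$\begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$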
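The inverse formula for $GL(2, \mathbb{Z}_p)$ translates directly into code. The following Python sketch is an illustration only: matrices are represented as pairs of rows, the function names are invented here, and the built-in `pow(det, -1, p)` (available in Python 3.8 and later) is used for the inverse of the determinant mod p.

```python
def mat_mul(A, B, p):
    """Multiply two 2 x 2 matrices with all arithmetic done mod p."""
    (a, b), (c, d) = A
    (x, y), (u, v) = B
    return (((a * x + b * u) % p, (a * y + b * v) % p),
            ((c * x + d * u) % p, (c * y + d * v) % p))


def mat_inv(M, p):
    """Inverse of M in GL(2, Z_p), p prime, via (ad - bc)^(-1) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = (a * d - b * c) % p
    if det == 0:
        raise ValueError("determinant is 0 mod p, so M is not in GL(2, Z_p)")
    det_inv = pow(det, -1, p)  # modular inverse of the determinant
    return ((d * det_inv % p, -b * det_inv % p),
            (-c * det_inv % p, a * det_inv % p))


M = ((1, 2), (3, 4))
M_inv = mat_inv(M, 5)
print(M_inv)
print(mat_mul(M, M_inv, 5), mat_mul(M_inv, M, 5))
```

Running it on the matrix of the special case reproduces the inverse computed by hand, and both products give the identity matrix.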

Question. How big is $GL(2, \mathbb{Z}_5)$?


Answer. 480 elements. To see this, note that the first row can be any vector (a, b) with
$a, b \in \mathbb{Z}_5$, except (0, 0) since the determinant must be nonzero. Thus there are 5 · 5 − 1 = 24
possible first rows of a matrix in $GL(2, \mathbb{Z}_5)$. Once the first row is given, the second row
cannot be a scalar (in $\mathbb{Z}_5$) multiple of the first row (as the determinant must be nonzero).
Thus there are $5^2 - 5 = 20$ possible second rows. This means that there are 24 · 20 = 480
elements of this group. In 1832 Galois was the first to consider such groups.
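The counting argument can be checked by brute force for small primes. This Python sketch (with an invented function name) simply runs through all $p^4$ matrices and keeps those with nonzero determinant mod p.

```python
from itertools import product


def gl2_order(p):
    """Count the 2 x 2 matrices over Z_p (p prime) with nonzero determinant."""
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p != 0)


print(gl2_order(5))  # the text's count is 24 * 20 = 480
```

The same row-by-row argument gives $(p^2 - 1)(p^2 - p)$ for any prime p, and the brute-force count agrees with it for the small primes one can test this way.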
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, 1 2 
Exercise 2.3.5 Find the inverse of € ‘) in GL(2, Z7). How big is GL(2,Z7)? 


Theorem 2.3.1 (Facts About Groups G). The following hold for any group G: 


(1) The identity element of G is unique. 

(2) Inverses are unique. 

(3) Cancellation law. For elements a, x,y in group G, we have ax=ay implies x=y. 

(4) Solution of equations. Given a, b in a group G, you can always solve ax = b for x in G. 


Proof. 

(1) Suppose e and f are both identities in G. Then e = ef = f, first using the fact that f is an
identity and then using the fact that e is an identity.

(2) Suppose an element $a \in G$ has two inverses b and c. Then if e is the identity, ab = e = ca
which implies

$$b = eb = (ca)b = c(ab) = ce = c.$$

(3) Multiply both sides of the equality ax = ay on the left by $a^{-1}$. You need to use the
associative law.


(4) It is an exercise to prove this. A 
Exercise 2.3.6 Prove part (4) of the preceding theorem. 


Corollary 2.3.1 Every row of the multiplication table of a finite group G is a permutation 
of the first row. 


Proof. The map from row 1 to row k is 1-1 by the cancellation law. This map is onto by 
part (4) of the preceding theorem or by the pigeonhole principle. A 


The preceding corollary means that the multiplication table for a finite group G is a Latin 
square — as we saw already in Exercise 2.2.4. 
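The row property in Corollary 2.3.1 is easy to test on small examples. The Python sketch below (helper names invented for the illustration) builds a multiplication table from a list of elements and an operation, then checks that every row is a rearrangement of the first row. It also shows the property failing for a set that is not a group.

```python
from math import gcd


def cayley_table(elements, op):
    """Table whose (i, j) entry is op(elements[i], elements[j])."""
    return [[op(a, b) for b in elements] for a in elements]


def rows_are_permutations(table):
    """True when every row contains the same entries as the first row."""
    first = sorted(table[0])
    return all(sorted(row) == first for row in table)


z6_add = cayley_table(list(range(6)), lambda a, b: (a + b) % 6)
units8 = [a for a in range(1, 8) if gcd(a, 8) == 1]
z8_units = cayley_table(units8, lambda a, b: (a * b) % 8)
z6_mul = cayley_table(list(range(6)), lambda a, b: (a * b) % 6)  # not a group

print(rows_are_permutations(z6_add))
print(rows_are_permutations(z8_units))
print(rows_are_permutations(z6_mul))
```

The last table fails because Z_6 under multiplication has non-invertible elements, so the cancellation law used in the proof is not available.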


Definition 2.3.2 Assume G is a group. Define the left multiplication function $L_a : G \to G$
by $L_a(x) = ax$, for $x \in G$.


Exercise 2.3.7 Show that the function $L_a$ is 1-1 and onto.


If you would like to think about a really large finite group of symmetries, you might want 
to look at the symmetries of the icosahedron, or symmetries of various chemical structures, 
or the group of motions of Rubik’s cube, or $S_n$ for large n, or $GL(2, \mathbb{Z}_p)$ for large prime p.
Infinite symmetry groups are as interesting - if not more so. 


Exercise 2.3.8 Which of the following groups are commutative? Give a reason for your
answer. Assume that $n \in \mathbb{Z}$ and n > 2.


(a) $\mathbb{Z}_n^{2 \times 2}$, the 2 × 2 matrices with entries in $\mathbb{Z}_n$ under addition;
(b) the group of proper rotations of a cube;
(c) $\mathbb{Z}_n^2$, the 2-vectors with entries in $\mathbb{Z}_n$ under addition;
(d) the 2 × 2 matrices g with entries in $\mathbb{Z}_n$ and such that $\det(g) \in \mathbb{Z}_n^*$, under matrix
multiplication. Here $\mathbb{Z}_n^*$ denotes the unit group in Definition 2.3.1.


Exercise 2.3.9 (Wilson’s Theorem). Show that if p is a prime, then
$(p - 1)! \equiv -1 \pmod p$.


Hint. Pair the elements a of the unit group $\mathbb{Z}_p^*$ with their inverses $a^{-1}$ for multiplication.
There are only two solutions of $a \equiv a^{-1} \pmod p$, namely 1 and −1 (mod p). Thus we can
see that the product $2 \cdot 3 \cdots (p - 2)$ is congruent to 1 (mod p).
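A numerical check does not replace the proof, but it can make the pairing in the hint more concrete. In the Python sketch below (function names invented for the illustration), `wilson_holds` tests the congruence in Wilson's theorem and `middle_product` computes the product 2 · 3 ⋯ (p − 2) mod p from the hint.

```python
from math import factorial


def wilson_holds(p):
    """Check whether (p - 1)! is congruent to -1 mod p."""
    return factorial(p - 1) % p == p - 1


def middle_product(p):
    """Product 2 * 3 * ... * (p - 2) reduced mod p (empty product is 1)."""
    result = 1
    for a in range(2, p - 1):
        result = (result * a) % p
    return result


for p in (3, 5, 7, 11, 13):
    print(p, wilson_holds(p), middle_product(p))
print(4, wilson_holds(4), 6, wilson_holds(6))  # composite moduli for contrast
```

For p = 7 the pairs are 2 · 4 ≡ 1 and 3 · 5 ≡ 1 (mod 7), which is why the middle product is 1.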


Recall that Euler’s phi function $\phi(n)$ is the number of elements of $\mathbb{Z}_n^*$. To compute it,
you need to factor n into a product of distinct prime powers and use equation (2.4). The
following exercises complete the proof of equation (2.4).


Exercise 2.3.10 Show that if p and q are distinct primes then Euler’s function satisfies

$$\phi(pq) = (p - 1)(q - 1).$$


Exercise 2.3.11 Show that if m and n satisfy gcd(m, n) = 1, then Euler’s function satisfies
$\phi(mn) = \phi(m)\phi(n)$.


Hint. First prove that the map sending x (mod mn) such that gcd(x, mn) = 1 to the ordered
pair (x (mod m), x (mod n)) maps $\mathbb{Z}_{mn}^*$ 1-1, onto $\mathbb{Z}_m^* \times \mathbb{Z}_n^* = \{(a, b) \mid a \in \mathbb{Z}_m^*,\ b \in \mathbb{Z}_n^*\}$. This
really comes from the Chinese remainder theorem (in Section 6.2).


Exercise 2.3.12 Use the preceding exercise (and Exercise 2.3.3) to prove equation (2.4) for
$\phi(p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r})$, if the $p_i$ are pairwise distinct primes.


2.4 Subgroups 


So let us talk about subgroups - that is, subsets of groups that are groups as well (using the
same operations as in the big group). We have already seen examples, such as $\{I, F\} \subset D_3$.
Throughout the definitions in this section we will be envisioning a group with the group
operation being multiplication. Of course, for many examples, the operation is addition.

Before considering subgroups generated by a single element of a group, we should say a
bit about integer powers of an element a in a group G. The rules are essentially the same
as the rules for integer powers of real numbers. Define $a^0 = e$, the identity in G, and, for a
positive integer n, define

$$a^n = \underbrace{a \cdot a \cdots a}_{n \text{ times}}.$$

Then define

$$a^{-n} = (a^{-1})^n, \quad \text{for all } n = 1, 2, 3, \ldots.$$

One could also make the definition of $a^n$ inductively.
Proposition 2.4.1 (Facts About Integer Exponents are the Usual). Assuming a is an element
of a group G and $r, s \in \mathbb{Z}$, we have the following two facts.


(1) $a^r a^s = a^{r+s}$,
(2) $(a^r)^s = a^{rs}$.


Proof.
(1) If both exponents r and s are non-negative integers, this fact follows easily from the
definition. You just have to think how many factors of a are on each side. If both r and
s are negative integers, this fact is easily deduced from the definition in a similar way
by counting how many factors of $a^{-1}$ are on each side.
The only case that requires some effort is the case that one exponent is negative and
one positive. If r = −m with m > 0 and s > 0, then

$$a^r a^s = \underbrace{a^{-1} \cdots a^{-1}}_{m \text{ factors}} \, \underbrace{a \cdots a}_{s \text{ factors}},$$

where, by the associative law, we can cancel $a^{-1} a = e$ repeatedly. If s ≥ m, what remains is
$a^{s-m} = a^{r+s}$. If s < m, what is left is

$$(a^{-1})^{m-s} = a^{-(m-s)} = a^{r+s}.$$

(2) The second fact about exponents can be shown as follows. If s is positive, then using
the first fact (and mathematical induction), we have

$$(a^r)^s = \underbrace{a^r \cdots a^r}_{s \text{ factors}} = a^{\overbrace{r + \cdots + r}^{s \text{ terms}}} = a^{rs}.$$

If s = 0, the result is clear as both sides are the identity. If s < 0, we can argue in a
similar way using the definition and the fact that

$$(a^r)^{-1} = a^{-r}, \quad \text{for all } r.$$


Exercise 2.4.1 Finish the proof of the second fact in Proposition 2.4.1 when s <0. 


Example: The Unit Group $\mathbb{Z}_7^*$. The group $\mathbb{Z}_7^*$ consists of integers a (mod 7) such that
gcd(a, 7) = 1. The operation is multiplication modulo 7. Consider the powers of 2. You
get $2^0 \equiv 1 \pmod 7$, $2^1 \equiv 2 \pmod 7$, $2^2 \equiv 4 \pmod 7$, $2^3 = 8 \equiv 1 \pmod 7$. The powers will
repeat after this. Hence H = {1, 2, 4 (mod 7)} is a cyclic subgroup of $\mathbb{Z}_7^*$ consisting of all
powers of 2 (mod 7). ▲


Exercise 2.4.2 Find all the powers of 2 in Z;, and Zi;. 


We have already been using the words “order of a group” in accordance with the following 
definition. But let us make it official. 


Definition 2.4.1 The order of a finite group G is the number of elements in G, denoted
|G| or #(G).

Examples: Orders of Various Groups


1. The group of integers mod n under addition: $|\mathbb{Z}_n| = n$.
2. $\mathbb{Z}_n^*$ = the unit group of elements a of $\mathbb{Z}_n$ such that gcd(a, n) = 1 under multiplication
mod n: $|\mathbb{Z}_n^*| = \phi(n)$ = the Euler phi-function.
3. $D_n$ = the dihedral group of motions of a regular n-gon: $|D_n| = 2n$ by Exercise 2.1.13.
4. The symmetric group of permutations of n objects: $|S_n| = n! = n(n-1)(n-2) \cdots 2 \cdot 1$
by Exercise 2.1.13. ▲


There is yet another use of the word “order” - not to mention that of Section 1.3. 


Definition 2.4.2 The order of an element a in a group G is the smallest positive integer
n such that $a^n = e$ = identity in G. If no such n exists, then we say that a has infinite
order. We will usually write |a| = the order of a.


Example 1: The Unit Group $\mathbb{Z}_7^*$. The unit group $\mathbb{Z}_7^*$ consists of integers a (mod 7) such that
gcd(a, 7) = 1. The operation is multiplication modulo 7.

The order of 2 in $\mathbb{Z}_7^*$ was shown to be 3 in the first example of this section.

What about the order of 3 in $\mathbb{Z}_7^*$? The following computation shows that |3| = 6 in $\mathbb{Z}_7^*$.

$$3^2 = 9 \equiv 2, \quad 3^3 \equiv 6 \equiv -1, \quad 3^4 \equiv -3 \equiv 4, \quad 3^5 \equiv -9 \equiv -2 \equiv 5, \quad 3^6 \equiv -6 \equiv 1 \pmod 7.$$

Moreover, this shows that $\mathbb{Z}_7^*$ is a cyclic group generated by 3 (mod 7):

$$\mathbb{Z}_7^* = \{3^n \pmod 7 \mid n \in \mathbb{Z}\} = \{1, 2, 3, 4, 5, 6 \pmod 7\} = \langle 3 \pmod 7 \rangle.$$ ▲
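The computation of orders in a unit group is mechanical, so it is a natural thing to hand to a computer. The Python sketch below (the function name is invented here) multiplies by a repeatedly until the power returns to 1, which is exactly Definition 2.4.2 for the group of units mod n.

```python
from math import gcd


def multiplicative_order(a, n):
    """Smallest k >= 1 with a^k congruent to 1 mod n, for a unit a mod n (n >= 2)."""
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("a must be a unit mod n, with n >= 2")
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        k += 1
    return k


orders_mod_7 = {a: multiplicative_order(a, 7) for a in range(1, 7)}
print(orders_mod_7)
```

The elements of order 6 are the generators of the cyclic group; the text's computation shows that 3 is one of them.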


Example 2: The Dihedral Group $D_3$. In the dihedral group $D_3$ of motions of an equilateral
triangle, the order of R (which represents counterclockwise rotation by 120°) is 3 and the
order of F (which is the reflection across an axis stretching from one vertex to the midpoint
of the opposite side) is 2. ▲


Example 3: The Additive Group $\mathbb{Z}_6$. We investigate the orders of the elements of the group
$\mathbb{Z}_6$ under addition mod 6. Here the powers

$$a^n = \underbrace{a \cdots a}_{n \text{ times}}$$

for integers n > 0 become multiples

$$n \cdot a = \underbrace{a + \cdots + a}_{n \text{ times}} \pmod 6, \quad \text{for } n > 0.$$

We list the elements and their orders in a table. The orders are all the divisors of the order
of $\mathbb{Z}_6$. ▲

a = element of $\mathbb{Z}_6$ under + mod 6 | |a|
0 (mod 6)                          |  1
1 (mod 6)                          |  6
2 (mod 6)                          |  3
3 (mod 6)                          |  2
4 (mod 6)                          |  3
5 (mod 6)                          |  6


Example 4: The Additive Group $\mathbb{Z}$. If x is a nonzero element of the group $\mathbb{Z}$ under addition,
then the order of x is ∞. To see this, note that for any positive integer n, we know that n · x
is not zero. There are no zero divisors in $\mathbb{Z}$. ▲


Exercise 2.4.3 


(a) Find the orders of all elements of $\mathbb{Z}_8$ under addition mod 8.
(b) Do the same for the unit group $\mathbb{Z}_8^*$ under multiplication mod 8.


Exercise 2.4.4 Find the orders of all elements of the unit group Z{,. Then do the same for 
the unit group Z73. 


So finally we define subgroup. 


Definition 2.4.3 A subgroup H of a group G is a subset $H \subset G$ which is itself a group
under the same operation as for G. If e denotes the identity of G, we will say that H = {e}
is the trivial subgroup of G. If the subgroup H of group G is such that $H \neq \{e\}$ and $H \neq G$,
we will say that H is a proper subgroup of G.


Example 1: Subgroups of the Additive Group $\mathbb{Z}_6$. Look at the subset $H = \{2n \pmod 6 \mid n \in \mathbb{Z}\}$
of $G = \mathbb{Z}_6$, under the same operation + mod 6 as in G. This is a subgroup. To see this,
note that if 2n (mod 6) and 2m (mod 6) are in H then $2n + 2m = 2(n + m) \pmod 6 \in H$, for
all $n, m \in \mathbb{Z}$. It follows that $0 \in H$ and that H is closed under both addition and subtraction.
Thus H is a subgroup as the associative law certainly holds. ▲


Example 2: Subgroups of the Dihedral Group $D_3$. As in Section 2.1, $D_3 = \{I, R, R^2, F, FR, FR^2\}$
is the dihedral group of motions of an equilateral triangle. Subgroups are

$$H_1 = \{I, F\}, \quad H_2 = \{I, FR\}, \quad H_3 = \{I, FR^2\}, \quad H_4 = \{I, R, R^2\}. \tag{2.8}$$

There are also two other so-called improper subgroups

$$H_5 = \{I\} \quad \text{and} \quad H_6 = D_3 \text{ itself}.$$

Note that the orders of the subgroups are 1, 2, 3, and 6. These are all the positive divisors
of 6.

The collection of subgroups of $D_3$ forms a poset under the relation of ⊂. We can draw the
poset diagram of the subgroups of $D_3$ as in Section 1.7. See Figure 2.17. Here an ascending
line means the lower group is a subgroup of the upper group and there is no subgroup
between the lower and upper groups. See Dummit and Foote [28] for many more examples
of poset diagrams for subgroups of various groups. ▲

Figure 2.17 Poset diagram for subgroups of $D_3$ as defined in (2.8)


Exercise 2.4.5 


(a) Are there any proper subgroups of the dihedral group D3 other than those listed in (2.8)? 
Why? 

(b) Draw the poset diagram for subgroups of the Klein 4-group with the multiplication table 
in Figure 2.14. 


Exercise 2.4.6 Draw the poset diagram for the subgroups of $\mathbb{Z}_{12}$ under addition mod 12.


Exercise 2.4.7 Draw the poset diagram for the subgroups of $\mathbb{Z}_{12}^*$ under multiplication
mod 12.


Exercise 2.4.8 Draw the poset diagram for the dihedral group $D_4$.


Example 3. The group $\mathbb{Z}_n$ of integers mod n under + (mod n) is not a subgroup of the
group $\mathbb{Z}$ of integers under +. Why? The two operations are different. For example
$3 + 3 \equiv 0 \pmod 6$ in $\mathbb{Z}_6$, but 3 + 3 = 6, which is not 0 in $\mathbb{Z}$. ▲


We have two tests for a non-empty subset H of G to be a subgroup of G under the 
operation of G. 


Proposition 2.4.2 (Two-Step Subgroup Test). Suppose that G is a group and $H \subset G$ such that
$H \neq \emptyset$. Assume that for all $a, b \in H$ it follows that $a \cdot b \in H$ and $a^{-1} \in H$. Then H is a subgroup
of G.


Proof. It is clear, by hypothesis, that H is closed under multiplication. Moreover, the
associative law follows from that in G. Also, we have the existence of inverses in H, by hypothesis.
But why does H have an identity? We know there is an $a \in H$ as $H \neq \emptyset$. We also know $a \in H$
implies $a^{-1} \in H$. Thus, by our hypothesis, with $b = a^{-1}$, the identity $e = a a^{-1} \in H$.


There is also a shorter test. 


Proposition 2.4.3 (One-Step Subgroup Test). Suppose that G is a group and $H \subset G$ such
that $H \neq \emptyset$. Assume that for all $a, b \in H$ it follows that $a \cdot b^{-1} \in H$. Then H is a subgroup of G.

Proof. Since H is non-empty, there is an element $a \in H$. Then $e = a \cdot a^{-1} \in H$, by hypothesis.
Take a = e to see that $b \in H$ implies $e \cdot b^{-1} = b^{-1} \in H$. Then $a, b \in H$ implies that
$a \cdot (b^{-1})^{-1} = a \cdot b \in H$. So we are done by the two-step subgroup test.


There is a third test for a non-empty finite subset H of G to be a subgroup of G under 
the operation of G. 


Proposition 2.4.4 (Finite Subgroup Test). Suppose that G is a group and $H \subset G$, H finite,
such that $H \neq \emptyset$. Assume that $a, b \in H \implies a \cdot b \in H$. Then H is a subgroup of G.


Proof. We use the two-step test from Proposition 2.4.2. Thus we need only show that $a \in H$
implies $a^{-1} \in H$. If a = e = identity, then $a^{-1} = e$ and we are done. Otherwise look at the
subset $\{a, a^2, a^3, \ldots, a^n, \ldots\} \subset H$. Since H is finite, we must have $a^i = a^j$ for some i > j ≥ 1.
Then $a^{i-j} = e$ and i − j ≥ 1. In fact i − j ≥ 2, since $a \neq e$. Therefore $e \in H$, and
$a \cdot a^{i-j-1} = e$ implies $a^{i-j-1} = a^{-1} \in H$.


Definition 2.4.4 If a is an element of a group G, the cyclic subgroup generated by a is
$\langle a \rangle = \{a^n \mid n \in \mathbb{Z}\}$.


It is not hard to see that $\langle a \rangle$ is a subgroup using the one-step test from Proposition 2.4.3
and properties of exponents. Just look at the following calculation:

$$a^r (a^s)^{-1} = a^{r-s} \in \langle a \rangle.$$

Example 1: Cyclic Subgroups of the Additive Group $\mathbb{Z}_6$. In $\mathbb{Z}_6$ under addition mod 6, we find
that all the subgroups are cyclic (which is a general fact about cyclic groups to be proved
in Section 2.5). Remember that $a^n$ becomes na for $a \in \mathbb{Z}_6$ and an integer n.


$\langle 0 \rangle = \{n \cdot 0 \pmod 6 \mid n \in \mathbb{Z}\} = \{0 \pmod 6\}$

$\langle 1 \rangle = \{n \cdot 1 \pmod 6 \mid n \in \mathbb{Z}\} = \{0, 1, 2, 3, 4, 5 \pmod 6\} = \langle 5 \equiv -1 \pmod 6 \rangle = \mathbb{Z}_6$

$\langle 2 \rangle = \{n \cdot 2 \pmod 6 \mid n \in \mathbb{Z}\} = \{0, 2, 4 \pmod 6\} = \langle 4 \equiv -2 \pmod 6 \rangle$

$\langle 3 \rangle = \{n \cdot 3 \pmod 6 \mid n \in \mathbb{Z}\} = \{0, 3 \pmod 6\}$


Note that the order of the element a in $\mathbb{Z}_6$, which was computed earlier in this section,
is the same as the number of elements in $\langle a \rangle$. We prove this, in general, in Section 2.5. ▲
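The list of cyclic subgroups can be generated mechanically. In the Python sketch below (the function name is invented for the illustration), we keep adding a to itself mod n until we return to 0; the values visited form the cyclic subgroup generated by a, and their number is the order of a.

```python
def cyclic_subgroup(a, n):
    """Return the cyclic subgroup of Z_n generated by a, as a sorted list."""
    seen, x = [], 0
    while x not in seen:
        seen.append(x)
        x = (x + a) % n
    return sorted(seen)


for a in range(6):
    H = cyclic_subgroup(a, 6)
    print(a, H, "order of a =", len(H))
```

Different generators can give the same subgroup: 1 and 5 both generate all of Z_6, while 2 and 4 both generate {0, 2, 4}.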


Exercise 2.4.9 Are there any subgroups of the group $\mathbb{Z}_6$ other than those listed above? Draw
the poset diagram for $\mathbb{Z}_6$.


Example 2: Cyclic Subgroups of the Dihedral Group $D_3$. Recall that, as in Section 2.1, the
dihedral group of symmetries of an equilateral triangle is

$$D_3 = \{I, R, R^2, F, FR, FR^2\}.$$

The cyclic subgroups of $D_3$ are the ones we found in Figure 2.17 except for $D_3$ itself which
is not cyclic:

$$\langle F \rangle = \{I, F\}, \quad \langle FR \rangle = \{I, FR\}, \quad \langle FR^2 \rangle = \{I, FR^2\},$$
$$\langle I \rangle = \{I\}, \quad \langle R \rangle = \{I, R, R^2\} = \langle R^2 \rangle.$$ ▲
Next we look at the subgroup of elements that commute with everything in a group.


Definition 2.4.5 The center of a group G is $Z(G) = \{a \in G \mid ax = xa,\ \forall x \in G\}$.


Proposition 2.4.5 The center of G is a commutative subgroup of G.


Exercise 2.4.10 Prove the preceding proposition using the one-step subgroup test from
Proposition 2.4.3.


Example 1: The Center of the Dihedral Group $D_3$. We claim that the center of the dihedral
group $D_3$ is $Z(D_3) = \{I\}$. To see this, look at the group table in Figure 2.9 of Section 2.2.
No row equals the corresponding column except the first row and column corresponding
to the identity I. ▲


Example 2: The Center of the Dihedral Group $D_4$. We can show that the center of the
dihedral group $D_4$ is $Z(D_4) = \{I, R^2\}$. Here R means rotate counterclockwise 90°. To generate
$D_4$, we need R and F, a flip about an axis through a vertex and the point at the center of the
square. To see that the center of $D_4$ is what we claim, we need to multiply a few elements
of $D_4$. Using permutation notation, labelling the vertices of the square 1, 2, 3, 4, we see that

\[
R = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}
\quad \text{and} \quad
F = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix}.
\]


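One finds then that $FR = R^{-1}F$. This is one of our defining relations for $D_4$. The others
are $R^4 = I$ and $F^2 = I$. The elements of $D_4$ are
$\{I, R, R^2, R^3, F, FR, FR^2, FR^3\}$. No element $FR^i$ lies in the center: using $RF = FR^{-1}$
(which follows from $FR = R^{-1}F$), we get $(FR^i)R = FR^{i+1}$ while $R(FR^i) = FR^{i-1}$, and these
are equal only if $R^2 = I$. This means the center of $D_4$ is a subgroup of
$\langle R \rangle = \{I, R, R^2, R^3\}$. Every power of R commutes with R, so $R^i$ is in the center iff
it commutes with F. Now $FR^i = R^{-i}F$. In order for $FR^i = R^iF$, for $i \in \{1, 2, 3, 4\}$, we
need $R^i = R^{-i}$. That requires $2i \equiv 0 \pmod 4$, which implies $i \equiv 0 \pmod 2$. This means that
$FR^2 = R^2F$ is the only non-trivial possibility. Thus the center of $D_4$ is as stated. ▲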
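
The claim $Z(D_4) = \{I, R^2\}$ can also be confirmed by brute force. Here is a minimal Python sketch under stated assumptions: vertices are relabelled 0, 1, 2, 3 (instead of 1, 2, 3, 4), a permutation is stored as the tuple of its values, and the helper names `compose` and `generate` are hypothetical.

```python
def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """Return the finite group generated by the permutations in gens."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:      # a new element: record it and explore from it
                group.add(h)
                frontier.append(h)
    return group

R = (1, 2, 3, 0)   # rotation: vertex i goes to i + 1 (mod 4)
F = (0, 3, 2, 1)   # flip fixing vertices 0 and 2

D4 = generate([R, F])
center = {z for z in D4 if all(compose(z, g) == compose(g, z) for g in D4)}
print(len(D4), sorted(center))
```

With this labelling the tuple (2, 3, 0, 1) is $R^2$, so the computed center consists of the identity and $R^2$.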


Exercise 2.4.11 Find the centers of the dihedral groups $D_5$ and $D_6$.


Hint. Let R be the counterclockwise rotation through an angle of $2\pi/n$ and let F be the flip
about an axis through a vertex and the center of the n-gon. One can prove that
$FR = R^{-1}F$. For $n \ge 3$, no element $FR^i$ commutes with R, since $(FR^i)R = FR^{i+1}$ while
$R(FR^i) = FR^{i-1}$. So the center of $D_n$ is contained in the subgroup generated by R. It follows
from $FR^i = R^{-i}F$ that $FR^i = R^iF$ can only happen if $2i \equiv 0 \pmod n$.


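Example 3: The Center of the General Linear Group $GL(2, \mathbb{R})$. We find the center of the
general linear group $GL(2, F)$, where $F = \mathbb{R}$ = the (field of) real numbers. Here, as in equation
(2.5) of Section 2.3, where $\mathbb{R}$ was replaced with $\mathbb{Z}_p$, for prime p,

\[
GL(2, \mathbb{R}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \;\middle|\; a, b, c, d \in \mathbb{R}, \ ad - bc \neq 0 \right\}.
\]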
The group operation is matrix multiplication. The center of $GL(2, \mathbb{R})$ is

\[
Z(GL(2, \mathbb{R})) = \left\{ aI = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \;\middle|\; a \in \mathbb{R}, \ a \neq 0 \right\}. \tag{2.9}
\]


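To prove this, you just have to see which matrices commute with all matrices in $GL(2, \mathbb{R})$.
Let a, b, c, d be the entries of a matrix in the center of $GL(2, \mathbb{R})$. Then for all $u, v, w \in \mathbb{R}$ with
$uw \neq 0$ we have

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} u & v \\ 0 & w \end{pmatrix}
= \begin{pmatrix} u & v \\ 0 & w \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix}.
\]

Doing the multiplications we get

\[
\begin{pmatrix} au & av + bw \\ cu & cv + dw \end{pmatrix}
= \begin{pmatrix} ua + vc & ub + vd \\ wc & wd \end{pmatrix}.
\]

It follows that au = au + cv for all v in $\mathbb{R}$ - in particular, for v = 1. Thus c = 0. Similarly
b = 0 can be seen by looking at

\[
\begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \begin{pmatrix} u & 0 \\ v & w \end{pmatrix}
= \begin{pmatrix} u & 0 \\ v & w \end{pmatrix} \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}.
\]

This implies

\[
\begin{pmatrix} au + bv & bw \\ dv & dw \end{pmatrix}
= \begin{pmatrix} ua & ub \\ va & vb + wd \end{pmatrix}.
\]

So au + bv = au. Taking v = 1 gives b = 0.
Thus our matrix must be diagonal. To see that the diagonal entries must be equal, look at
the preceding computations again, now with b = c = 0. You see that dv = va. Taking v = 1
implies d = a. So our matrix is aI, where I is the identity matrix.
Note that our arguments also work for matrices with entries in $F = \mathbb{Z}_p$, p a prime. ▲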
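
Since the argument also works over $\mathbb{Z}_p$, the result can be tested by exhaustive search for a small prime. The sketch below is only an illustration, not part of the original text: a matrix with rows (a, b) and (c, d) is stored as the tuple (a, b, c, d), and the function names `gl2`, `mul`, and `center` are hypothetical. Brute force is feasible only for very small p.

```python
from itertools import product

def gl2(p):
    """All invertible 2x2 matrices over Z_p, as tuples (a, b, c, d)."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p != 0]

def mul(m, k, p):
    """Matrix product m*k with entries reduced mod p."""
    a, b, c, d = m
    e, f, g, h = k
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def center(p):
    """Matrices in GL(2, Z_p) that commute with every element of the group."""
    G = gl2(p)
    return [z for z in G if all(mul(z, g, p) == mul(g, z, p) for g in G)]

print(len(gl2(3)), center(3))
```

For p = 3 the only matrices found are the two non-zero scalar matrices, as equation (2.9) predicts with $\mathbb{R}$ replaced by $\mathbb{Z}_3$.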


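Exercise 2.4.12 For prime p, find the center of the affine group

\[
\mathrm{Aff}(p) = \left\{ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \;\middle|\; a, b \in \mathbb{Z}_p, \ a \in \mathbb{Z}_p^* \right\}, \tag{2.10}
\]

where the group operation is matrix multiplication.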


Exercise 2.4.13 


(a) Suppose that H and K are subgroups of G. Show that $H \cap K$ is also a subgroup of G.
(b) Is the same true for the union of subgroups? Why? 


If you try to write down a multiplication table for an abstract group G of order 4 that
is not cyclic, you will be forced to write down a table like that below. What is the main
property of the table? Every element has order 2 or 1. No element has order 3 or order 4.
So the multiplication table looks like:

 ·  | e | a | b | c
 e  | e | a | b | c
 a  | a | e | c |
 b  | b |   | e |
 c  | c |   |   | e

Note that the main diagonal consists of e, the identity, since all elements $\neq e$ have order
2. What is ab? There is only one choice ab = c. Why? If ab = b = eb, for example, then by
cancellation a = e. But $a \neq e$. That is true since our group has order 4. Similarly ab = a
would force b = e, and ab = e = aa would force b = a.


Exercise 2.4.14 Consider an abstract group G of order 4 that is not cyclic. Explain why G 
cannot have elements of order 3 or 4. Complete the table above and show that the result 
gives a group. Do you recognize it? Later we will have Lagrange’s theorem to make this 
problem easier. See Section 3.3. 


Exercise 2.4.15 State whether each of the following is true or false and give a reason for 
your answer. 


(a) The unit group $\mathbb{Z}_n^*$ (with the operation of multiplication mod n) is a subgroup of the
group $\mathbb{R}^* = \mathbb{R} - \{0\}$ (under multiplication).

(b) The affine group Aff(p) for prime p defined in equation (2.10), is a subgroup of the
general linear group $GL(2, \mathbb{Z}_p)$ defined in equation (2.5). Here the group operation is
matrix multiplication for both Aff(p) and $GL(2, \mathbb{Z}_p)$.

(c) It is possible to view the dihedral group $D_n$ as a subgroup of the symmetric group $S_n$.


One of our goals should be to get to know the small groups really well. Ultimately, we 
should know how many different groups G with |G| < 15 exist - up to isomorphism, a word 
meaning mathematically identical, which will be defined in Section 3.2. And one should be 
able to write down Cayley tables for such groups. Group Explorer will help in this endeavor 
certainly. For prime orders, the only groups are cyclic. That is our next topic. However, if 
we were chemists, we would need to know much more about the 230 space groups (219 if 
you do not distinguish between mirror images). These are discrete subgroups of the group 
of Euclidean motions of 3-space: that is, the group generated by rotations and translations. 
Crystallography is based on the study of such groups. See Gallian [33, Chapter 28] for a 
discussion of the 17 crystallographic groups in the plane - also known as wallpaper groups. 
Wikipedia has a long article on space groups listing the numbers in dimensions up to 5. 


2.5 Cyclic Groups are Our Friends 


The cyclic groups are the simplest, thus our friends. They are actually old friends - as
we met them a long time ago in Chapter 1. They are always Abelian (but the converse
is not true). And they have many applications as we shall see in Chapter 4. Some
applications come from thinking of large finite cyclic groups as good approximations for a
circle - thus the name “cyclic.” They might be called the groups of déjà vu or “what goes
around comes around.” If you keep multiplying a given element a of a finite multiplicative
cyclic group by itself, you eventually get back to the identity and then the whole cycle
repeats: $e, a, a^2, \ldots, a^n = e, a, a^2, \ldots$ I discuss many applications of cyclic groups in my
book [116] and we shall see many before we reach the end of this story. The fast Fourier
transform is one such application. It allows rapid signal and data processing.

Other groups are more complicated, especially the non-Abelian ones. The reason may be 
that - as J. H. Conway says - such groups “are adept at doing large numbers of impossible 
things before breakfast.” 

The following definition says what it means for a group G to be a cyclic group under 
multiplication. 


Definition 2.5.1 A group G is cyclic means $G = \langle a \rangle = \{a^m \mid m \in \mathbb{Z}\}$ for some $a \in G$.
We call a a generator of G. If the order of the cyclic group is n, we will call it $C_n$.


Another way to think about this is to say that the multiplication table for the group can
be identified (by clever relabeling) with that for $\mathbb{Z}_n$ under addition mod n.


Example: The Multiplication Table for an Abstract Cyclic Group G of Order 10 under
Multiplication. See Figure 2.18 for this table. The multiplication table for a cyclic group of
order 10 generated by a is really the “same” as that of $\mathbb{Z}_{10}$ under addition mod 10. More
precisely, this means the following. You identify e with 0 (mod 10), $a^x$ with x (mod 10).
Define a finite logarithm function f by

\[ f(a^x) = x \pmod{10}. \]


Figure 2.18 The Group Explorer version of the multiplication table for a cyclic group of order 10 


This is a well-defined 1-1 function since (as will be checked in the exercise below) $a^x = a^y$ iff
$x \equiv y \pmod{10}$. The function also maps the order 10 cyclic group $\langle a \rangle$ onto $\mathbb{Z}_{10}$. Finally one
can check that $f(c) + f(d) \equiv f(cd) \pmod{10}$. Later (see Definition 3.2.1) we will call such a
function $f : G \to \mathbb{Z}_{10}$ a group isomorphism. ▲


Exercise 2.5.1 Check that if G is a cyclic group of order 10, $G = \langle a \rangle$, then $f(a^x) = x \pmod{10}$
is a well-defined function and $f(c) + f(d) \equiv f(cd) \pmod{10}$. Show that f maps G 1-1 onto
Zo. Then find the orders of a’ and @ in G. 


Recall that the order of a finite group means the number of elements in the group. When 
we study groups of order n, we need only look at groups that are really different - that is, 
non-isomorphic groups. In particular, we should not distinguish between the two groups in 
Exercise 2.5.1. Thus (up to isomorphism) there is really only one group of order 1. There is 
only one group of order 2 and only one group of order 3. There are two groups of order 4, 
one group of order 5, and so on. We will be able to prove such things soon. See Section 4.5 
for our table of groups of order < 15. There are longer tables of small groups in books and 
on the internet. You can also use the Group Explorer program to see a list of small groups 
and to find multiplication tables, Cayley graphs and other information about them. In the 
old days I looked at a book by A. D. Thomas and G. V. Wood [120]. It lists the multiplication 
tables for all groups of order < 32, except the cyclic groups. Another possibility is to use 
SAGE as explained in Robert A. Beezer’s exercises in Thomas W. Judson’s online book [50].

Our goal in this section is to prove the main theorem about cyclic groups, which is 
Theorem 2.5.1 below. Before we can prove this theorem, we need more information about 
computation in cyclic groups. 


Facts About Powers in Finite Cyclic Groups 


Suppose that G is a finite group and $a \in G$. Suppose that a has order n = |a|, meaning that
the positive integer n is the smallest such that $a^n = e$ = identity of G. Then we have the
following facts.

1. $a^i = a^j \iff i \equiv j \pmod n$. Multiplication in G corresponds to addition of exponents
mod n.

2. $\langle a \rangle = \{a, a^2, a^3, \ldots, a^{n-1}, a^n = e\}$. The elements in the set are distinct and thus the two
ways we use the word “order” are in agreement. The order n of a is the same as the order
of the cyclic group $\langle a \rangle$. That is,

\[ |\langle a \rangle| = |a|. \]

3. $a^k = e$ = identity $\iff$ n divides k $\iff k \equiv 0 \pmod n$.


Proof of fact 3. This is just a special case of fact 1 (take j = 0). So we will say no more about fact 3.
Proof of fact 1. $\Longleftarrow$. Suppose $i \equiv j \pmod n$. This means $i - j = n \cdot q$ for some integer q.
Then, by Proposition 2.4.1,

\[ a^{i-j} = a^{nq} = (a^n)^q = e^q = e. \]

It follows that $a^{i-j} = e$ and then that $a^i = a^j$ upon multiplying both sides by $a^j$, again using
Proposition 2.4.1.
$\Longrightarrow$. Suppose $a^i = a^j$. Then multiply both sides by $a^{-j}$. Using Proposition 2.4.1, this gives
$a^{i-j} = e$, the identity. We want to show that n = |a| divides i − j. To do this, we apply the
division algorithm to i − j to get

\[ i - j = nq + r, \quad \text{with } 0 \le r < n. \]

The proof will be complete if we can show the remainder r = 0. We know that $a^n = e$ and

\[ e = a^{i-j} = a^{nq+r} = (a^n)^q a^r = a^r. \]

So $e = a^r$. But if r were not 0, this would contradict the definition of |a| = n, the smallest
positive integer such that $a^n = e$. Thus r = 0 and $i \equiv j \pmod n$. We leave the statement about
multiplication as an exercise.


Proof of fact 2. If n is the order of a, and we look at the powers $a^j$, $j \ge 1$, the first power
to be the identity is the nth power. Moreover, the elements of the set

\[ \{a, a^2, a^3, \ldots, a^{n-1}, a^n = e\} \]

must be distinct by fact 1. For no two distinct elements in the set $\{1, 2, 3, \ldots, n\}$ can be
congruent mod n. Why?


Exercise 2.5.2 


(a) Show that if a is as in the preceding facts about powers in cyclic groups, then $a^i a^j = a^k$,
where $i + j \equiv k \pmod n$.
(b) Answer the “why?” in the proof of fact 2 above. 


Morals of the Three Facts About Powers 


It is easier to compute in Z, under addition mod n than it is to compute in the multiplicative 
group of powers of the element a. That is, one should take logs. In the old days (when I was a 
teenager) we learned to multiply real numbers using log tables which change multiplication 
to addition. The slide rules we used when I was a college student had the same effect. They 
were mechanical log tables and they made a horrible noise when hundreds of students 
were using them on a physics exam. Click click click ... Anyway, the same principle works 
in cyclic groups. Take logs and compute with the exponents i (mod n) rather than $a^i$.
Discrete logarithms in finite cyclic groups have applications in cryptography. The discrete 
log problem concerns the speed of computing discrete logs. 
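
To make the moral concrete, here is a minimal Python sketch of a discrete logarithm in the multiplicative group of non-zero integers mod a prime p. It is only an illustration: the function name `discrete_log` is hypothetical, and the search is plain brute force, usable only for small p (the difficulty of doing better for large p is what cryptography relies on).

```python
def discrete_log(g, h, p):
    """Smallest x >= 0 with g**x congruent to h mod p, or None if there is none."""
    value = 1                      # value holds g**x mod p
    for x in range(p - 1):
        if value == h % p:
            return x
        value = (value * g) % p
    return None

# Modulo 11 the element 2 has order 10, so logs to base 2 are taken mod 10.
p, g = 11, 2
log9 = discrete_log(g, 9, p)
log7 = discrete_log(g, 7, p)
# Multiplying 9 and 7 mod 11 corresponds to adding their logs mod 10.
print(log9, log7, discrete_log(g, (9 * 7) % p, p), (log9 + log7) % (p - 1))
```

The last two printed numbers agree: the log of the product is the sum of the logs, reduced modulo the order of the generator.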

Another moral is that for a finite cyclic group $\langle a \rangle$ the Cayley graph $X(\langle a \rangle, \{a, a^{-1}\})$ is
a finite circle. See Figures 2.19 and 2.20 for the case that the order of $\langle a \rangle$ is 10.


Example. Suppose the order of a is |a| = n. How many elements of

\[ \langle a \rangle = \{a, a^2, \ldots, a^{n-1}, a^n = e\} \]

are squares? ▲


Answer. An element $b = a^k$ of $\langle a \rangle$ is a square if there exists an integer power y with
$b = a^k = (a^y)^2$. By fact 1 about powers in cyclic groups, $a^k = (a^y)^2$ is solvable iff, given k, there
exists a y solving the congruence

\[ k \equiv 2y \pmod n. \tag{2.11} \]

Now we have a linear congruence to solve rather than an abstract non-linear equation
in some multiplicative group. The congruence in (2.11) will have a solution y for a given k
iff gcd(2, n) divides k. This follows from Exercise 1.6.9.
Figure 2.19 Cayley graph $X(\langle a \rangle, \{a, a^{-1}\})$ for
a cyclic group $\langle a \rangle$ of order 10


Figure 2.20 A less boring picture of a 10-cycle 


So we have two cases. 


Case 1. If n is even, we will only have a solution for even exponents k modulo n. That is,
only half of the elements of $\langle a \rangle$ will be squares in this case.
Case 2. If n is odd, then everything in $\langle a \rangle$ is a square.
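
The two cases are easy to test with the logarithm trick: instead of squaring powers of a, double the exponents mod n. The following Python sketch is an illustration only; the function name `square_exponents` is hypothetical.

```python
from math import gcd

def square_exponents(n):
    """Exponents k (mod n) for which a**k is a square in <a>, where |a| = n."""
    return sorted({(2 * y) % n for y in range(n)})

print(square_exponents(10))       # n even: only the even exponents appear
print(len(square_exponents(9)))   # n odd: all 9 exponents appear
```

In both cases the number of squares is n / gcd(2, n), which matches the solvability condition for the congruence (2.11).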


The following results involve generalizing this example from squares to kth powers. 


Three More Facts About Powers in Cyclic Groups

Again assume |a| = n in a finite group G. Then we have the following three facts.

4. $\langle a^k \rangle = \langle a \rangle$ iff gcd(k, n) = 1.
5. $\langle a^k \rangle = \langle a^{\gcd(n,k)} \rangle$.
6. $|a^k| = \dfrac{n}{\gcd(n, k)}$.

Proof of fact 4. This is just a special case of fact 5.
Proof of fact 5. To show the set equality, we use our logarithm argument mentioned earlier.
So begin by noting that $x \in \langle a^k \rangle$ is equivalent to saying that $x = a^b$, where

\[ b \equiv yk \pmod n, \tag{2.12} \]

for some $y \in \mathbb{Z}$. By Exercise 1.6.9, we know that we can solve (2.12) for y iff gcd(n, k) divides
b. This says that $x = a^b \in \langle a^{\gcd(n,k)} \rangle$.

Proof of fact 6. We leave this as an exercise. Again use the logarithm method to switch the 
problem to one involving linear congruences. 


Exercise 2.5.3 Prove fact 6 above. 


Hint. The multiplicative order of $a^k$ in $\langle a \rangle$ is the same as the additive order of k in $\mathbb{Z}_n$. Set
d = gcd(k, n). You need to show that $kx \equiv 0 \pmod n$ iff $\frac{n}{d}$ divides x.
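
Before proving facts 4, 5, and 6, it may help to see them hold numerically. The sketch below works in the additive group of integers mod n, where the "power" $a^k$ becomes the multiple k of the generator 1. It is an illustration under that assumption, not a proof, and the function name `generated` is hypothetical.

```python
from math import gcd

def generated(k, n):
    """The subgroup of Z_n generated by k: all multiples of k mod n."""
    return {(k * y) % n for y in range(n)}

n = 12
for k in range(1, n + 1):
    d = gcd(n, k)
    assert generated(k, n) == generated(d, n)                 # fact 5
    assert len(generated(k, n)) == n // d                     # facts 2 and 6
    assert (generated(k, n) == generated(1, n)) == (d == 1)   # fact 4
print("facts 4, 5, 6 hold in Z_12")
```

Changing n to any other positive integer gives the same outcome.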


Now finally we can prove the main theorem about cyclic groups G in the case that G is 
finite. Part (1) also holds if G is an infinite cyclic group. See Exercise 2.5.8. 


Theorem 2.5.1 (Main Theorem on Cyclic Groups and Their Subgroups). Suppose G is a
finite cyclic group of order |G|.

(1) Any subgroup H of G is cyclic.

(2) The order |H| of subgroup H of G must divide the order |G|.

(3) For every divisor d of |G|, there is a unique subgroup H of G such that |H| = d. If
$G = \langle a \rangle$ and n = |G|, then $H = \langle a^{n/d} \rangle$.


Proof.

(1) Suppose $G = \langle a \rangle$. Let m be the smallest positive integer such that $a^m \in H$. Then (1) is a
consequence of the following claim.

Claim. $H = \langle a^m \rangle$.

Proof of Claim. By the division algorithm, if $a^t \in H$, we have t = mq + r with $0 \le r < m$.
But then by Proposition 2.4.1, $a^r = a^{t-mq} = a^t (a^m)^{-q} \in H$, contradicting the definition
of m unless r = 0. Thus H must equal $\langle a^m \rangle$. QED Claim.

(2) Using part (1) of this theorem, we know that $H = \langle a^m \rangle$. Facts 2 and 6 in our list of facts
about powers in cyclic groups preceding this theorem say

\[ \text{if } G = \langle a \rangle \text{ and } |a| = n, \text{ then } |H| = |a^m| = \frac{n}{\gcd(n, m)}. \]

Thus |H| certainly divides n.

(3) Suppose d divides n = |a|, where $G = \langle a \rangle$. Then, by fact 6 in our list of facts about powers
in cyclic groups preceding this theorem, a subgroup of G of order d is $A = \langle a^{n/d} \rangle$.

We should explain why this subgroup $A = \langle a^{n/d} \rangle$ is the only subgroup of G with
order d. Suppose H is another subgroup of G with |H| = d. We know by (1) and (2) that
$H = \langle a^m \rangle$ for some m. Then, by fact 5 in our list of facts about powers in cyclic groups
preceding this theorem, we have $H = \langle a^{\gcd(m,n)} \rangle$. Let gcd(m, n) = k. Then k divides n.
Now $H = \langle a^k \rangle$ and |H| = d imply (by fact 6) that

\[ d = |a^k| = \frac{n}{\gcd(n, k)} = \frac{n}{k}, \]

since k divides n. It follows that $\frac{n}{d} = k$. This proves $A = \langle a^{n/d} \rangle = \langle a^k \rangle = H$. ▲
We will see in Section 3.2 that part (2) of Theorem 2.5.1 holds for all finite groups - not
just the cyclic ones. 


Example. The poset diagram (as in Section 1.7) for the subgroups of a cyclic group of order
24 under the relation $H \subset K$ is now easy to draw. It is the same as the poset diagram of the
subgroups of the group $\mathbb{Z}_{24}$ under addition (mod 24). See Figure 2.21. ▲


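Figure 2.21 Poset diagram of the subgroups of $\mathbb{Z}_{24}$ under addition. The rows of the diagram,
from top to bottom, are: $\mathbb{Z}_{24} = \langle 1 \bmod 24 \rangle$ under addition mod 24; $\langle 2 \rangle$, $\langle 3 \rangle$; $\langle 4 \rangle$, $\langle 6 \rangle$;
$\langle 8 \rangle$, $\langle 12 \rangle$; $\langle 24 \rangle = \langle 0 \rangle$.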


If one wants to draw the poset diagram for a cyclic group of order n, then the more 
primes dividing n, the higher dimensional the diagram. Here there are only two primes and 
so the diagram is two-dimensional. The diagram is also the same as that for the divisors of 
24 under the relation a|b, except that up and down are reversed. See Figure 1.13. 

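In these poset diagrams we only draw a line from the group above to the subgroup directly
below. For example, $\langle 0 \rangle$ is a subgroup of $\langle 4 \rangle$, but we do not draw a line from $\langle 4 \rangle$ to $\langle 0 \rangle$;
the lines down to $\langle 0 \rangle$ come only from $\langle 8 \rangle$ and $\langle 12 \rangle$. The line is drawn between subgroups
$A \subset B$ iff there is no subgroup of G between A and B.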
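
The lines of such a diagram can be listed mechanically. By Theorem 2.5.1 the subgroups of the additive group of integers mod n are the sets of multiples of d, one for each divisor d of n, and the multiples of e are contained in the multiples of d exactly when d divides e. The Python sketch below uses that description; the function names `divisors`, `subgroup`, and `covering_pairs` are hypothetical.

```python
def divisors(n):
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def subgroup(d, n):
    """The subgroup of Z_n generated by the divisor d of n."""
    return frozenset(range(0, n, d))

def covering_pairs(n):
    """Pairs (d, e) where <e> lies directly below <d> in the poset diagram."""
    divs = divisors(n)
    pairs = []
    for d in divs:
        for e in divs:
            if e != d and e % d == 0:        # <e> is a proper subgroup of <d>
                between = any(m not in (d, e) and m % d == 0 and e % m == 0
                              for m in divs)
                if not between:              # nothing strictly in between
                    pairs.append((d, e))
    return pairs

for d, e in covering_pairs(24):
    print(f"<{d}> -- <{e}>")
```

For n = 24 this lists ten lines, with the divisor 24 standing for the subgroup {0}.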


Exercise 2.5.4 Draw the poset diagram for the subgroups of the cyclic group of order 60. 
There are three primes so the diagram is three-dimensional. 


We end this section with a formula counting the number of elements of order d in a 
cyclic group G if d divides the order of G. 


Theorem 2.5.2 Assume $G = \langle a \rangle$, where |G| = |a| = n is finite. Then, if d divides n, the
number of elements of order d in G is $\phi(d)$, the Euler phi-function (which is the order of
$\mathbb{Z}_d^*$, or the number of integers a between 1 and d with gcd(a, d) = 1).


Proof. By part (3) of Theorem 2.5.1, there is a unique subgroup $H = \langle b \rangle$ of G with |H| = d.
Then every element of order d in G must generate H, since it generates a subgroup of order d.
Fact 4 in our list of six facts about powers in cyclic groups says that $H = \langle b^k \rangle$ iff
gcd(k, d) = 1. There are $\phi(d)$ such numbers k modulo d. ▲
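
Theorem 2.5.2 can be checked by direct counting in the additive group of integers mod n. The following Python sketch is only a numerical check for one value of n, with hypothetical function names `additive_order` and `euler_phi`; the order of each element is found by repeated addition, so no earlier fact is assumed.

```python
from collections import Counter
from math import gcd

def additive_order(k, n):
    """Order of k in Z_n: the least r >= 1 with r*k congruent to 0 mod n."""
    r, x = 1, k % n
    while x != 0:
        x = (x + k) % n
        r += 1
    return r

def euler_phi(d):
    """Number of integers a with 1 <= a <= d and gcd(a, d) = 1."""
    return sum(1 for a in range(1, d + 1) if gcd(a, d) == 1)

n = 24
counts = Counter(additive_order(k, n) for k in range(n))
for d in sorted(counts):
    print(d, counts[d], euler_phi(d))   # order, number of elements, phi(order)
```

Each row shows the same number twice, and the orders that occur are exactly the divisors of 24.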
Example 1. Suppose p is a prime. Then $\phi(p) = p - 1$, since all integers a between 1 and
p − 1 have gcd(a, p) = 1. The only positive divisors of p are 1 and p. Then the cyclic group
$\mathbb{Z}_p$, under + (mod p), has p − 1 generators a (mod p), a = 1, 2, ..., p − 1. ▲


Example 2. It turns out that when p is prime, $\mathbb{Z}_p^* = \{1, 2, \ldots, p - 1 \pmod p\}$, under
multiplication mod p, is cyclic. This will be Exercise 6.3.10. However not all groups $\mathbb{Z}_n^*$ are cyclic.
For example $\mathbb{Z}_8^*$ and $\mathbb{Z}_{12}^*$ are not cyclic. The theorem on the subject says that $\mathbb{Z}_n^*$ is cyclic
iff $n = 2, 4, p^r$, or $2p^r$, where p is an odd prime. See Kenneth Rosen [91], Daniel Shanks [103,
pp. 92ff], or Ramanujachary Kumanduri and Cristina Romero [62, p. 173]. ▲


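The group $\mathbb{Z}_p^*$ has order p − 1 and thus has $\phi(p - 1) = \phi(\phi(p))$ generators. In the
following examples, we list the smallest positive generator.

\[
\begin{aligned}
&\mathbb{Z}_5^* = \langle 2 \pmod 5 \rangle, \quad \mathbb{Z}_7^* = \langle 3 \pmod 7 \rangle, \quad \mathbb{Z}_{11}^* = \langle 2 \pmod{11} \rangle, \quad \mathbb{Z}_{13}^* = \langle 2 \pmod{13} \rangle, \\
&\mathbb{Z}_{17}^* = \langle 3 \pmod{17} \rangle, \quad \mathbb{Z}_{19}^* = \langle 2 \pmod{19} \rangle, \quad \mathbb{Z}_{23}^* = \langle 5 \pmod{23} \rangle, \quad \mathbb{Z}_{29}^* = \langle 2 \pmod{29} \rangle.
\end{aligned}
\]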


Finding generators of $\mathbb{Z}_p^*$ is so useful in number theory that many books on the subject
include tables giving the generators g of $\mathbb{Z}_p^*$ for small values of p. These generators are called
“primitive roots” mod p in number theory. Finding them by hand can be time consuming.
Mathematica will find them for you. The number 2 works about 37% of the time according
to a famous conjecture of E. Artin from 1927 which remains unproved though there is much
evidence for the conjecture. See Shanks [103] for more information.
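
For small primes a short program does the search as well. The sketch below is a brute-force illustration in Python, not the method used by Mathematica; the function names `multiplicative_order` and `smallest_primitive_root` are hypothetical, and p is assumed to be prime.

```python
def multiplicative_order(g, p):
    """Order of g in the multiplicative group mod the prime p (g not divisible by p)."""
    order, x = 1, g % p
    while x != 1:
        x = (x * g) % p
        order += 1
    return order

def smallest_primitive_root(p):
    """Smallest positive g whose powers give every non-zero residue mod the prime p."""
    for g in range(1, p):
        if multiplicative_order(g, p) == p - 1:
            return g

for p in (5, 7, 11, 13, 17, 19, 23, 29):
    print(p, smallest_primitive_root(p))
```

The output reproduces the list of smallest generators given above.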


Exercise 2.5.5 State whether 2 is a primitive root mod p: that is, $\mathbb{Z}_p^* = \langle 2 \pmod p \rangle$, for all
the primes p < 100.


Exercise 2.5.6 State whether each of the following statements is true or false and give a 
brief reason for your answer. 


(a) A cyclic group can have only one generator. 
(b) Any Abelian group is cyclic. 
(c) The group G = {a, b, c, d} of order 4 with the following multiplication table is cyclic.

 ×  | a | b | c | d
 a  | a | b | c | d
 b  | b | a | d | c
 c  | c | d | b | a
 d  | d | c | a | b


Exercise 2.5.7 Find all the subgroups of the group Z of integers under addition. 


Hint. Use well-ordering. 
Exercise 2.5.8 Show that any subgroup of an infinite cyclic group is cyclic. 


Exercise 2.5.9 Suppose $G = \langle a \rangle$ is an infinite cyclic group. Show that if two of its subgroups
are equal, namely, $\langle a^r \rangle = \langle a^s \rangle$ for $r, s \in \mathbb{Z}^+$, then $r = \pm s$ and conversely.
Exercise 2.5.10 Let a be an element of a multiplicative group. Suppose the order of a is
|a| = n. How many elements of $\langle a \rangle = \{a, a^2, \ldots, a^{n-1}, a^n = e\}$ are cubes?


Exercise 2.5.11 State whether each of the following statements is true or false and give a 
reason for your answer. 


(a) The multiplicative group Z% is a subgroup of the multiplicative group Q* =Q — {0}. 
(b) Q is an additive subgroup of the additive group R. 
(c) The additive group Z,75 has a subgroup of order 3. 


There are some other interesting figures that can be associated to a group: cycle diagrams.
These diagrams have as nodes every group element a and show all the cyclic subgroups
$\langle a \rangle = \{e, a, a^2, \ldots, a^{n-1}\}$ as cycle graphs. The cycles $\langle a \rangle$ and $\langle b \rangle$ will be interconnected if
one element is in the cyclic subgroup generated by the other. So, for example consider
the cycle diagram for the multiplicative group $\mathbb{Z}_{15}^*$ in Figure 2.22 below. The cycles are
{1, −1 = 14}, {1, −4 = 11}, {1, 2, 4, 8}, {1, −2 = 13, 4, −8 = 7}. See Shanks [103, pp. 87ff]
for many more such figures. Group Explorer will also create these diagrams.


Figure 2.22 Cycle diagram in the multiplicative group $\mathbb{Z}_{15}^*$


Exercise 2.5.12 Draw a cycle diagram for Z3,. Then do the same for Z>,. 
3.1 Groups of Permutations


We have already seen that permutation notation allows us to compute more easily in the
symmetry groups of various geometric figures such as $D_n$. The symmetric group of
permutations of n objects is an extremely important group for many reasons - not least of which
being the applications in computer science. It also is an extremely large group for large n.
We have already said many things about this group. Let us review them first.


Definition 3.1.1 The symmetric group is

\[ S_n = \{\sigma : \{1, 2, 3, \ldots, n\} \to \{1, 2, 3, \ldots, n\} \mid \sigma \text{ is 1-1 and onto}\}, \]

for n = 2, 3, 4, 5, ... Multiplication is composition of functions.


Throughout this section we will assume $n \ge 2$ for obvious reasons.
A permutation $\sigma$ is uniquely determined by its list of values and we shall write

\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}. \]

It follows from this representation that there are $n! = n(n-1)(n-2) \cdots 2 \cdot 1$ elements of
$S_n$, as we showed in Section 2.3.

In Section 3.2 we will see that every finite group G of order n can be viewed as a subgroup
of $S_n$. This is Cayley’s theorem - proved in 1854. Subgroups of $S_n$ are known as permutation
groups. Thus every finite group can be viewed as such a permutation group. We saw in
Section 2.1 that the dihedral group $D_3$ of symmetries of an equilateral triangle can be
viewed as $S_3$. The dihedral group $D_4$ is similarly considered as a proper subgroup of $S_4$. For
the order of $D_4$ is 8 while the order of $S_4$ is 24. Similarly $D_n$ is viewed as a proper subgroup
of $S_n$ for $n \ge 4$, since $D_n$ permutes the vertices of a regular n-gon. In fact, we have shown in
Exercise 2.1.13 that the order of $D_n$ is 2n. The order of $S_n$ grows large quite rapidly. Some
examples are $|S_5| = 120$, $|S_6| = 720$, $|S_{52}|$ is approximately $8 \times 10^{67}$.

The last group, $S_{52}$, is of interest for a card player. Statisticians study $S_{52}$ to learn how
many shuffles it takes to put a deck of cards in “random” order. See Diaconis [24] for exam- 
ple, which includes many other applications as well as a discussion of Fourier analysis 
on the symmetric group - a subject too intricate to include in our text, where we con- 
sider only the cyclic groups and the Abelian group Z5 in Section 4.2 and Exercises 8.2.15, 
8.2.16 respectively. There are applications of the symmetric group $S_n$ in a variety of fields,




for example: computer science, economics, psychology, statistics, chemistry, physics. Math- 
ematica will do computations in the symmetric group. So will SAGE. See Beezer’s SAGE 
exercises in Judson [50]. 


Disjoint Cycle Notation for Permutations. Let us define the cycle notation by example.
Consider the permutation

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 3 & 4 & 2 & 1 \end{pmatrix}. \]

This permutation sends 1 to 5 and 5 to 1. In addition, it sends 2 to 3, 3 to 4, and 4 to 2. So
we write

\[ \sigma = (15)(234) = (234)(15). \]

The cycles (15) and (234) are said to be disjoint since each acts on disjoint sets of
numbers, namely {1, 5} and {2, 3, 4}. Since the cycles (15) and (234) act on disjoint sets of
numbers, the cycles commute: that is, (15)(234) = (234)(15).
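
Finding the disjoint cycles is a purely mechanical process: start at a number, follow the permutation until you return, then start again at a number not yet visited. The Python sketch below illustrates this. It assumes a permutation is stored as a dictionary sending each number to its image, omits cycles of length 1, and uses the hypothetical function name `cycles`.

```python
def cycles(perm):
    """Disjoint cycles of perm (a dict x -> image of x), omitting fixed points."""
    seen = set()
    result = []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        x = perm[start]
        while x != start:          # follow start -> perm[start] -> ... back to start
            cycle.append(x)
            seen.add(x)
            x = perm[x]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

sigma = {1: 5, 2: 3, 3: 4, 4: 2, 5: 1}
print(cycles(sigma))   # [(1, 5), (2, 3, 4)]
```

The printed cycles are the ones found above for the permutation sending 1 to 5, 5 to 1, 2 to 3, 3 to 4, and 4 to 2.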


Exercise 3.1.1 Find the disjoint cycle decomposition for every permutation in $S_3$ and $S_4$.
Exercise 3.1.2 If $\sigma = (a_1 a_2 \cdots a_k)$ is a cycle in $S_n$, show that the order of $\sigma$ is $|\sigma| = k$.
Exercise 3.1.3 Show, by example, that non-disjoint cycles need not commute.

We will need the following definition to compute orders of permutations written in
disjoint cycle notation.


Definition 3.1.2 The least common multiple of two positive integers m, n, written
lcm[m, n] = [m, n], is the smallest positive integer that is a multiple of both m and n.


Examples. lcm[2, 3] = 6, lcm[4, 20] = 20. ▲


Note that one can give a definition of least common multiple which is similar to the
one we gave for the greatest common divisor of two integers. That is, one can show that
lcm[m, n] = c iff the following two properties hold:


(1) both m and n divide c;
(2) m divides c′ and n divides c′ $\implies$ c divides c′.


Exercise 3.1.4 Prove this last statement. 
Exercise 3.1.5 Compute lcm[13, 169] and lcm[11, 1793]. 


Exercise 3.1.6 Show using unique factorization into primes that we can compute the lcm
as follows, once we have factored the integers involved as a product of the pairwise distinct
primes $p_i$, i = 1, ..., k:

\[
\mathrm{lcm}\left[ \prod_{i=1}^{k} p_i^{e_i}, \ \prod_{i=1}^{k} p_i^{f_i} \right] = \prod_{i=1}^{k} p_i^{h_i}, \quad \text{where } h_i = \max\{e_i, f_i\}.
\]
Exercise 3.1.7 Prove that, for integers n, m, not both 0, we have

\[ \mathrm{lcm}[n, m] = \frac{nm}{\gcd(n, m)}. \]


Exercise 3.1.8 Extend the definitions of gcd and lcm to sets of three or more integers. Do
the formulas in Exercise 1.5.23 and the last two exercises extend?


Proposition 3.1.1 (Facts About Permutations). The following hold: 


(1) Every permutation in $S_n$ can be written as a product of disjoint cycles. Here “disjoint”
means that the numbers moved by each of the cycles form pairwise disjoint sets.

(2) Disjoint cycles commute. 

(3) The order of a product of disjoint cycles is the least common multiple of the cycle 
lengths. 


Proof. 


(1) Given $\sigma \in S_n$, choose $a \in \{1, \ldots, n\}$ and look at the orbit of a under the cyclic subgroup
$\langle \sigma \rangle \subset S_n$:

\[ \mathrm{Orb}(a) = \{\sigma^i a \mid i = 1, 2, \ldots, |\sigma|\}. \]

From the orbit, we get a cycle permutation $(a, \sigma a, \sigma^2 a, \ldots, \sigma^{r-1} a)$, where r is the number
of elements in Orb(a). If the set of integers represented by the orbit of a under $\langle \sigma \rangle$ is not
all of the set $\{1, \ldots, n\}$, choose an element b not in the orbit of a. The orbit of b will give rise
to another cycle permutation $(b, \sigma b, \sigma^2 b, \ldots)$. Keep going in this way. The orbits partition
$\{1, 2, \ldots, n\}$ into a finite union of sets

\[ \{1, 2, \ldots, n\} = \mathrm{Orb}(a) \cup \mathrm{Orb}(b) \cup \cdots \cup \mathrm{Orb}(z). \]

Corresponding to this partition, we get the disjoint cycle decomposition of $\sigma$:

\[ \sigma = (a, \sigma a, \sigma^2 a, \ldots)(b, \sigma b, \sigma^2 b, \ldots) \cdots (z, \sigma z, \sigma^2 z, \ldots). \]

(2) Since the cycles act on disjoint sets of numbers, the ordering of the cycles does not
matter.

(3) Suppose that $\alpha$ and $\beta$ are disjoint cycles. Then, since they commute by part (2), we see
that $(\alpha\beta)^k = \alpha^k \beta^k = e \iff \alpha^k = e$ and $\beta^k = e$. The result is a consequence of this fact.
See the exercise that follows. ▲


Exercise 3.1.9 Finish the proof of part (3) of Proposition 3.1.1. 


Example. What is the order of $\sigma$ if

\[ \sigma = (12)(354)? \]

Answer. The order of $\sigma$ is 6 = lcm[2, 3].
Check.

 k | $\sigma^k$
 1 | $(12)(354) = (354)(12)$
 2 | $(12)^2(354)^2 = (354)^2 = (345)$
 3 | $(12)^3(354)^3 = (12)$
 4 | $(12)^4(354)^4 = (354)$
 5 | $(12)^5(354)^5 = (12)(345)$
 6 | $(12)^6(354)^6 = (1)$
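
The same check can be run by machine, comparing the least common multiple of the cycle lengths with the order found by taking successive powers. This Python sketch is an illustration only; a permutation is assumed to be stored as a dictionary sending each number to its image, and the function names are hypothetical.

```python
from functools import reduce
from math import gcd

def lcm(m, n):
    """Least common multiple of two positive integers."""
    return m * n // gcd(m, n)

def order_from_cycle_lengths(lengths):
    """Order of a product of disjoint cycles with the given lengths."""
    return reduce(lcm, lengths, 1)

def order_by_powers(perm):
    """Smallest k >= 1 such that the kth power of perm is the identity."""
    identity = {x: x for x in perm}
    power, k = dict(perm), 1
    while power != identity:
        power = {x: perm[power[x]] for x in perm}   # one more factor of perm
        k += 1
    return k

sigma = {1: 2, 2: 1, 3: 5, 5: 4, 4: 3}   # (12)(354)
print(order_from_cycle_lengths([2, 3]), order_by_powers(sigma))
```

Both numbers printed are 6, in agreement with the table.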


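Question. What is $\sigma^{-1}$?


To figure this out, recall our other notation for $\sigma$.

\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}. \]

To find the inverse function, just switch row 1 and row 2 to get

\[ \sigma^{-1} = \begin{pmatrix} \sigma(1) & \sigma(2) & \cdots & \sigma(n) \\ 1 & 2 & \cdots & n \end{pmatrix}. \]

Then we should reorder the columns to put the numbers on the top row in the usual order.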


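Example. Consider

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 5 & 3 & 4 \end{pmatrix} = (12)(354) \]

and find that

\[ \sigma^{-1} = \begin{pmatrix} 2 & 1 & 5 & 3 & 4 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix}. \]

Reordering the columns gives

\[ \sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 4 & 5 & 3 \end{pmatrix} = (12)(345). \]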


Another way to find the inverse is to use the fact that $\sigma$ is a product of disjoint cycles.
So then we just need to know the formula for the inverse of a cycle. One answer is to raise
the cycle to the power p = (length of cycle − 1). Another answer is to reverse the order of
the numbers in the cycle:

\[ (a_1 a_2 \cdots a_r)^{-1} = (a_r a_{r-1} \cdots a_2 a_1). \]

For our example, using the fact that disjoint cycles commute, we get the inverse of
$\sigma = (12)(354)$:

\[ \sigma^{-1} = (12)^{-1}(354)^{-1} = (12)(354)^2 = (12)(453) = (12)(345). \quad ▲ \]
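
Switching the two rows of the array is a one-line operation on a computer. The following Python sketch assumes, as an illustration, that a permutation is stored as a dictionary sending each number to its image; the function name `inverse` is hypothetical.

```python
def inverse(perm):
    """Inverse permutation: swap the roles of each number and its image."""
    return {image: x for x, image in perm.items()}

sigma = {1: 2, 2: 1, 3: 5, 4: 3, 5: 4}          # (12)(354)
sigma_inv = inverse(sigma)
print(dict(sorted(sigma_inv.items())))           # {1: 2, 2: 1, 3: 4, 4: 5, 5: 3}
# Composing sigma with its inverse should send every number to itself.
print(all(sigma[sigma_inv[x]] == x for x in sigma))
```

Sorting the keys plays the role of reordering the columns, and the result is the permutation (12)(345) found above.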


Exercise 3.1.10 tet r= (5 Se ee re ‘) 


5 43 271 6/7 


(a) Find the disjoint cycle decomposition of $\tau$.
(b) Find $\tau^{-1}$.
(c) Find the order of the permutation $\tau$.
Next we want to define even permutation. This will allow us to define the alternating
groups. It also arises in many contexts - for example, in the definition of the determinant,
as we shall see in Section 7.3.

In order to define the notion of even permutation, we need to be able to write any 
permutation o as a product of transpositions (ab), also known as 2-cycles. If you get 
an even number of transpositions, then o is an even permutation. Otherwise o is odd. 
By Lemma 3.1.2 below, even though there are many different products of transpositions 
representing a given permutation o, the numbers of transpositions occurring must be the 
same mod 2. Thus the evenness or oddness of o is a well-defined concept. 

In order to write an arbitrary permutation o as a product of transpositions or 2-cycles, 
we just need to write any cycle as a product of transpositions since o is a product of cycles. 
This decomposition of o as a product of transpositions is not unique. First consider an 
example. 


Example. We can write

\[
\begin{aligned}
(12345) &= (15)(14)(13)(12), \\
(12345) &= (12)(23)(34)(45).
\end{aligned}
\]

To see that this works, you just have to see where both sides send $x \in \{1, 2, 3, 4, 5\}$. Make
a table:

 x | (15)(14)(13)(12)x | (12)(23)(34)(45)x
 1 |         2         |         2
 2 |         3         |         3
 3 |         4         |         4
 4 |         5         |         5
 5 |         1         |         1

It follows that (12345) is an even permutation. In general, a cycle of odd length is an even
permutation. ▲


Lemma 3.1.1 Every transposition can be written as a product of an odd number of adjacent 
transpositions, that is, those of the form (a a+ 1). 


Instead of proving the lemma in general, we give an example. 
Example. (14) = (12)(23)(34)(32)(21). ▲


Exercise 3.1.11 Check the preceding example and then find the analogous decomposition of 
(13) and (27) in S7. What is the general formula for (ab) with a<b? 


Lemma 3.1.2 The even or oddness of a permutation is well defined. Equivalently, suppose
$\sigma = \alpha_1\alpha_2\cdots\alpha_r = \beta_1\beta_2\cdots\beta_s$, where $\alpha_i$ and $\beta_j$ are transpositions for all $i, j$. Then
$r \equiv s \pmod 2$.

Our discussion of this lemma follows that of Birkhoff and Mac Lane [9, p. 145]. Before
attempting a proof of this lemma, we need a certain useful polynomial.


Definition 3.1.3 We can associate the very useful polynomial in n indeterminates to our
problem. Define
$$V(x_1, \ldots, x_n) = \prod_{1 \le i < j \le n} (x_i - x_j).$$

Define the action of permutation $\sigma \in S_n$ on the polynomial by
$$(\sigma V)(x_1, x_2, \ldots, x_n) = V(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)}).$$


Birkhoff and Mac Lane show (see [9, p. 448]) that our very useful polynomial also occurs
when one solves cubic polynomial equations using repeated roots. The polynomial appears
in the formula for the discriminant of higher degree polynomials over $\mathbb{Q}$. If you wish to
avoid this polynomial, you can find a different proof of Lemma 3.1.2 in Gallian [33]. We
call the $x_1, \ldots, x_n$ indeterminates - not variables - because we are not viewing polynomials
as functions. In Section 5.5 we will discuss this distinction again.


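Exercise 3.1.12 Does $(\sigma\tau)V = \sigma(\tau V)$?

From the definition of $V(x_1, \ldots, x_n)$, we see that we have a definition of the sign of a
permutation, $\operatorname{sgn}(\sigma)$:
$$V(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)}) = \operatorname{sgn}(\sigma)\, V(x_1, \ldots, x_n), \quad \text{where } \operatorname{sgn}(\sigma) = \pm 1.$$
Lemma 3.1.2 will be proved if we can show that
$$\operatorname{sgn}(\sigma) = \begin{cases} 1, & \sigma \text{ even}, \\ -1, & \sigma \text{ odd}. \end{cases} \tag{3.1}$$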


Example. In the case n = 3, our polynomial is $V(x_1, x_2, x_3) = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3)$.
Then if $\sigma = (12)$, we have
$$((12)V)(x_1, x_2, x_3) = (x_2 - x_1)(x_2 - x_3)(x_1 - x_3) = -V(x_1, x_2, x_3).$$
If $\sigma = (13)$, then
$$((13)V)(x_1, x_2, x_3) = (x_3 - x_2)(x_3 - x_1)(x_2 - x_1) = (-1)^3 V(x_1, x_2, x_3) = -V(x_1, x_2, x_3).$$ ▲


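The following proposition implies equation (3.1) for $\operatorname{sgn}(\sigma)$. This will complete the proof
of Lemma 3.1.2.

Proposition 3.1.2 (Properties of $\operatorname{sgn}(\sigma)$). Suppose that $\alpha, \beta, \sigma$ are in $S_n$. Then we have the
following properties.

(1) $\operatorname{sgn}(\alpha\beta) = \operatorname{sgn}(\alpha)\operatorname{sgn}(\beta)$.
(2) $\operatorname{sgn}(\text{transposition}) = -1$.
(3) $\operatorname{sgn}(\sigma) = (-1)^{\text{number of transpositions in } \sigma}$.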
Proof.

(1) We have
$$\operatorname{sgn}(\alpha\beta) = \frac{V(x_{\alpha\beta(1)}, \ldots, x_{\alpha\beta(n)})}{V(x_1, \ldots, x_n)} = \frac{V(x_{\alpha(\beta(1))}, \ldots, x_{\alpha(\beta(n))})}{V(x_{\beta(1)}, \ldots, x_{\beta(n)})} \cdot \frac{V(x_{\beta(1)}, \ldots, x_{\beta(n)})}{V(x_1, \ldots, x_n)} = \operatorname{sgn}(\alpha)\operatorname{sgn}(\beta).$$

(2) From Lemma 3.1.1 above and part (1), we just need to consider adjacent transpositions.
Any transposition like $(a\ \ a+1)$ replaces the term $(x_a - x_{a+1})$ with its negative. The
other affected terms of V are of the form $(x_a - x_j)$ and $(x_{a+1} - x_j)$ with $j > a + 1$
(and, in the same way, $(x_i - x_a)$ and $(x_i - x_{a+1})$ with $i < a$). These
do not change the sign as they are just interchanged. Thus $\operatorname{sgn}(a\ \ a+1) = -1$.

(3) This follows from parts (1) and (2). ▲
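The sign defined through V can be computed directly: each factor $(x_{\sigma(i)} - x_{\sigma(j)})$ with $i < j$ is a factor of V up to sign, and the sign is negative exactly when $\sigma(i) > \sigma(j)$. The sketch below counts these sign changes and checks property (1) by brute force on $S_4$. It assumes permutations are stored in one-line notation as tuples, so `sigma[i - 1]` is $\sigma(i)$; the function names are illustrative.

```python
from itertools import permutations

def sgn(sigma):
    """Sign of sigma, read off from the factors of the polynomial V."""
    sign = 1
    n = len(sigma)
    for i in range(n):
        for j in range(i + 1, n):
            # (x_sigma(i) - x_sigma(j)) is minus a factor of V when sigma(i) > sigma(j)
            if sigma[i] > sigma[j]:
                sign = -sign
    return sign

def compose(alpha, beta):
    """Product alpha beta, applied right to left."""
    return tuple(alpha[beta[i] - 1] for i in range(len(beta)))

S4 = list(permutations(range(1, 5)))
multiplicative = all(sgn(compose(a, b)) == sgn(a) * sgn(b) for a in S4 for b in S4)

print(multiplicative, sgn((2, 1, 3)), sgn((3, 2, 1)))
```

Here `(2, 1, 3)` is the transposition (12) and `(3, 2, 1)` is (13), the two cases worked out by hand in the example for n = 3.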


Now - at long last - we can consider the alternating groups. 


Definition 3.1.4 The alternating group is $A_n = \{\sigma \in S_n \mid \sigma \text{ is even}\}$.

Proposition 3.1.3 (Facts about the Alternating Group).

(1) $A_n$ is a subgroup of $S_n$.
(2) $|A_n| = \dfrac{|S_n|}{2} = \dfrac{n!}{2}$.

Proof.

(1) Use the finite subgroup test. So you just need to see that the product of two even
permutations is even. Suppose $\sigma = \alpha_1\alpha_2\cdots\alpha_r$ and $\tau = \beta_1\beta_2\cdots\beta_s$, where the $\alpha_i$ and
$\beta_j$ are all transpositions. Here both r and s are even. Then $\sigma\tau$ is the product of $r + s$
transpositions, and $r + s$ is also even.

(2) There are as many even permutations as odd ones in $S_n$. For we have a 1-1, onto
function
$$T: A_n \to S_n - A_n \quad \text{defined by } T(\sigma) = (12)\sigma.$$ ▲
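The counting argument in part (2) can be illustrated for a small n. This is a sketch under the assumption that permutations are tuples in one-line notation and that parity is computed by counting inversions; `left_mult_by_12` is a hypothetical name for the map T of the proof.

```python
from itertools import permutations
from math import factorial

def is_even(sigma):
    """True when sigma has an even number of inversions."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return inversions % 2 == 0

def left_mult_by_12(sigma):
    """T(sigma) = (12) sigma: apply sigma first, then swap the values 1 and 2."""
    swap = {1: 2, 2: 1}
    return tuple(swap.get(v, v) for v in sigma)

n = 4
S_n = list(permutations(range(1, n + 1)))
A_n = [s for s in S_n if is_even(s)]
odd = [s for s in S_n if not is_even(s)]
image = sorted(left_mult_by_12(s) for s in A_n)

print(len(A_n), factorial(n) // 2, image == sorted(odd))
```

The image of $A_4$ under T is exactly the set of odd permutations, so the two halves of $S_4$ have 12 elements each.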
Exercise 3.1.13 Do the odd permutations in $S_n$ form a subgroup of $S_n$? Why?

Example: The Tetrahedral Group. Tet $= A_4$ is now our favorite new group. It is identifiable
with the group of proper rotations of a tetrahedron. Here “proper” means it can be
done by orientation-preserving 3-space rotation. See Figure 3.1 for a graph representing a
tetrahedron that has been flattened. ▲


Figure 3.1 Tetrahedron 
We can identify elements of the group Tet of proper rotations of the tetrahedron with
permutations in $S_4$ easily enough. But why must they be even? Note that any one of four
faces of the tetrahedron can be placed on the ground. Then you can rotate in three ways.
Thus the order of the group Tet looks like 12, that is, the same as the order of $A_4$. So we
have some evidence that Tet can be identified with $A_4$.

Exercise 3.1.14 Show that the two groups Tet and $A_4$ are really the same by numbering the
vertices of the tetrahedron and figuring out what permutation of the vertices corresponds to
the various motions in Tet.

Exercise 3.1.15 Work out the Cayley table for $A_4$. Of course the program Group Explorer
will do this for you but please show some of the computations here: that is, multiply out at
least half of the products in the table using the disjoint cycle notation to make life easier.
You could also use the presentation in Exercise 3.1.20 below.

Many authors enlarge the tetrahedral group by allowing improper rotations. They thus
get a group of order 24 which is $S_4$.

As we said earlier, the tetrahedral group is the symmetry group of methane $\mathrm{CH}_4$ - a main
component of natural gas and a big contributor to global warming.

The following exercise shows that our very useful polynomial is actually a determinant - 
the Vandermonde determinant. Thus we could deduce Proposition 3.1.2 from properties of 
determinants to be found in Exercise 7.3.1. 


Exercise 3.1.16 Prove that our very useful polynomial from Definition 3.1.3 is the
Vandermonde determinant defined by
$$V(x_1, x_2, \ldots, x_n) = \det \begin{pmatrix} x_1^{n-1} & x_1^{n-2} & \cdots & x_1 & 1 \\ x_2^{n-1} & x_2^{n-2} & \cdots & x_2 & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ x_n^{n-1} & x_n^{n-2} & \cdots & x_n & 1 \end{pmatrix}.$$

Then it is clear from properties of the determinant (see Section 7.3) that the transposition
$(ab)$, for $a, b \in \{1, 2, \ldots, n\}$, switches rows a and b and thus multiplies the determinant by
$-1$. Try the case n = 3 first.

Hint. Use induction on n.
Step 1. Subtract $x_n \cdot$ (the 2nd column) from the 1st column.
Step 2. Subtract $x_n \cdot$ (the 3rd column) from the 2nd column.
Keep going until
Step (n - 1). Subtract $x_n \cdot$ (the nth column) from the (n - 1)th column.
Then expand by minors of the last row and use the induction hypothesis.


Exercise 3.1.17 Consider the permutation
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 1 & 2 & 3 \end{pmatrix}.$$
Write $\sigma$ as a product of disjoint cycles. Is $\sigma$ even or odd? What is the order of $\sigma$? Find the inverse of $\sigma$.
Compute $\sigma^2$.
Exercise 3.1.18 State whether each of the following statements is true or false and give a
brief reason for your answer.

(a) The permutation $\tau = (123)(321)$ has order 3.
(b) The product of two odd permutations is odd.

Exercise 3.1.19 Show that, for $n \ge 3$, the group $A_n$ is generated by 3-cycles $(abc)$.

Hint. We know that $S_n$ is generated by 2-cycles. So we just need to think about products
of two 2-cycles. There are three cases to consider: $(ab)(ab)$, $(ab)(ac)$, $(ab)(cd)$, where
$a, b, c, d$ are pairwise distinct integers from $\{1, \ldots, n\}$. For the last case, look at $(ab)(bc)(bc)(cd)$.

Exercise 3.1.20 Show that the tetrahedral group $A_4$ has generators $R = (234)$ and $F = (12)(34)$,
with relations $R^3 = F^2 = (FR)^3 = I$.


3.2 Isomorphisms and Cayley’s Theorem 


At last we define what we mean when we say that two groups are mathematically the same 
or isomorphic. 


Definition 3.2.1 Suppose that G and G′ are groups. A function $T: G \to G'$ is called a
group isomorphism iff T is 1-1 and onto and T preserves the group operations, that is,
$T(xy) = T(x)T(y)$. Then we say that groups G and G′ are isomorphic and write $G \cong G'$.


Example 1. Any cyclic group of order n is isomorphic to the group of integers mod n under
addition mod n. We have already realized this, while using the finite analog of the logarithm,
first to investigate whether elements of cyclic groups can be squares and then in proving
the fifth fact about powers in cyclic groups in Section 2.5. To prove that $C_n \cong \mathbb{Z}_n$ - now that
we have a proper definition of isomorphic groups - suppose the cyclic group $C_n = \langle a \rangle =
\{a^x \mid x \in \mathbb{Z}\}$. Define the function $T: \mathbb{Z}_n \to C_n$ by $T(x \bmod n) = a^x$. The map is well defined
and 1-1 by the first fact about powers in cyclic groups in Section 2.5. The map is onto either
by the pigeonhole principle or by noting that every element of the cyclic group $\langle a \rangle$ is a
power of a. To see that the function preserves the group operation, use Proposition 2.4.1 to
show that
$$T(x + y \pmod n) = a^{x+y} = a^x a^y = T(x)T(y).$$

Of course, here the first group has the operation of addition and the second group has
multiplication as its operation. Nevertheless we can now identify the two groups as far as
this book is concerned. We can now translate any question about a cyclic group into a
question about the integers mod n under addition. We have our finite log table. ▲
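For a concrete instance of the map $T(x \bmod n) = a^x$, the sketch below takes the cyclic group to be the multiplicative group mod 7 with generator a = 3. That choice of group and generator is an assumption made only for illustration; it is not the example used in the text.

```python
n, p, a = 6, 7, 3          # Z_6 under addition, and the cyclic group generated by 3 mod 7

def T(x):
    """T(x mod n) = a^x, computed mod p."""
    return pow(a, x % n, p)

image = sorted(T(x) for x in range(n))
preserves = all(
    T((x + y) % n) == (T(x) * T(y)) % p
    for x in range(n) for y in range(n)
)
log_table = {T(x): x for x in range(n)}   # the "finite log table": group element -> exponent

print(image, preserves, log_table)
```

The image hits each of 1, ..., 6 once, so T is 1-1 and onto, and the check on all pairs confirms that addition mod 6 turns into multiplication mod 7.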


Example 2. Two groups of the same order need not be isomorphic. For example, the symmetric
group $S_3$ is not isomorphic to $\mathbb{Z}_6$ with the operation of addition mod 6. Any group
isomorphic to a commutative group would have to be commutative. ▲
Proposition 3.2.1 (Facts About Group Isomorphisms). Suppose that G and G′ are groups
and the mapping $T: G \to G'$ is a group isomorphism from G onto G′. Then we have the
following facts.

(1) If e is the identity of G and e′ is the identity of G′, then $T(e) = e'$.
(2) For all $x \in G$, we have $T(x^{-1}) = T(x)^{-1}$.
(3) The order of $T(x) \in G'$ is the same as the order of $x \in G$.

Proof.

(1) We leave the proof of part (1) as an exercise.
(2) The equalities
$$e' = T(e) = T(xx^{-1}) = T(x)T(x^{-1})$$
and
$$e' = T(e) = T(x^{-1}x) = T(x^{-1})T(x)$$
imply the desired result.
(3) Suppose $n = |T(x)|$ and $d = |x|$ are both finite. Then, using fact 3 about powers in finite
cyclic groups from Section 2.5,
$$T(x^n) = T(x)^n = e' = T(e)$$
implies $x^n = e$, as T is 1-1. Thus $d = |x|$ divides $n = |T(x)|$. To go the other way, we can
use the fact that $T^{-1}$ is also a group isomorphism by Exercise 3.2.5. So we see in the
same way that n divides d. Why can we do this? It follows that n = d. What happens
if x has infinite order? ▲


Exercise 3.2.1 Prove part (1) of Proposition 3.2.1. 

Exercise 3.2.2 Answer the questions in the proof of part (3) of Proposition 3.2.1. 
Exercise 3.2.3 Show that Zs under addition is isomorphic to Z} under multiplication. 
Exercise 3.2.4 Show that finite isomorphic groups must have the same order. 


Exercise 3.2.5 


(a) Show that if $T: G \to H$ is a group isomorphism then $T^{-1}: H \to G$ is also a group
isomorphism.

(b) Suppose $T: G \to H$ is a group isomorphism and $S: H \to K$ is also a group isomorphism.
Show that the composition $S \circ T: G \to K$ is a group isomorphism.


Exercise 3.2.6 Show that if G and H are isomorphic groups, then G commutative implies 
H is commutative also. 


Exercise 3.2.7 Consider the group $D_3$. Show that $D_3$ is isomorphic to a subgroup of $S_6$ as
follows. Denote the elements of $D_3$ as $\{g_1, g_2, \ldots, g_6\}$. Then consider the map taking an
element $a \in D_3$ to the permutation corresponding to row a of the Cayley table of $D_3$. This is
the permutation $\sigma = \sigma(a)$ given by
$$\begin{pmatrix} g_1 & g_2 & g_3 & g_4 & g_5 & g_6 \\ ag_1 = g_{\sigma(1)} & ag_2 = g_{\sigma(2)} & ag_3 = g_{\sigma(3)} & ag_4 = g_{\sigma(4)} & ag_5 = g_{\sigma(5)} & ag_6 = g_{\sigma(6)} \end{pmatrix}.$$

Write down the permutation $\sigma(a)$ explicitly for each element a of $D_3$.


The preceding exercise leads to the following theorem when generalized to any finite 
group. 


Theorem 3.2.1 (Cayley’s Theorem). If G is a finite group of order n, then G is isomorphic
to a subgroup of $S_n$, the group of permutations of n objects.

Proof. For any $g \in G$, recall (from Definition 2.3.2) that we have the left multiplication map
$L_g$ of G defined by setting $L_g(x) = gx$, for all $x \in G$. We showed in
Section 2.3 that $L_g$ is 1-1 and onto and thus permutes the elements of G.

We can enumerate $G = \{g_1, \ldots, g_n\}$ and write $L_a(g_i) = ag_i = g_{\sigma_a(i)}$, for a unique $\sigma_a \in S_n$.
Then define $T: G \to S_n$ by $T(a) = \sigma_a$. We need to show that T is well defined, 1-1, and
preserves the group operations. Certainly T is well defined. Suppose $\sigma_a = \sigma_b$. Then if $g_1 = e$,
the identity of G, we have $a = g_{\sigma_a(1)} = g_{\sigma_b(1)} = b$. So T is 1-1. Next consider
$$g_{\sigma_{ab}(i)} = (ab)g_i = a(bg_i) = a(g_{\sigma_b(i)}) = g_{\sigma_a(\sigma_b(i))}.$$

It follows that $T(ab) = \sigma_{ab} = \sigma_a \circ \sigma_b = T(a) \circ T(b)$.
The theorem is proved once you do the exercise below. ▲
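The construction in the proof is easy to carry out by machine. Below is a minimal sketch that builds $\sigma_a$ for every a in a finite group given by a list of elements and a multiplication function, then checks that T is 1-1 and that $T(ab) = \sigma_a \circ \sigma_b$. The group used here is $S_3$, chosen as a non-commutative test case; indices run from 0 rather than 1, and the name `cayley_embedding` is hypothetical.

```python
from itertools import permutations

def compose(s, t):
    """Composition s o t of permutations stored as tuples on {0, ..., n-1}."""
    return tuple(s[t[i]] for i in range(len(t)))

def cayley_embedding(elements, op):
    """Map each a to sigma_a, where a * g_i = g_{sigma_a(i)}."""
    index = {g: i for i, g in enumerate(elements)}
    return {a: tuple(index[op(a, g)] for g in elements) for a in elements}

# The group S_3, with elements as tuples on {0, 1, 2} and composition as the operation.
elements = list(permutations(range(3)))
T = cayley_embedding(elements, compose)

injective = len(set(T.values())) == len(elements)
homomorphism = all(
    T[compose(a, b)] == compose(T[a], T[b])
    for a in elements for b in elements
)

print(injective, homomorphism, len(next(iter(T.values()))))
```

Each $\sigma_a$ is a permutation of 6 indices, so this realizes $S_3$ inside $S_6$.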


Exercise 3.2.8 Show that if T is the function defined in the proof of Theorem 3.2.1, then
the image $T(G) = \{T(g) \mid g \in G\}$ is a subgroup of $S_n$.


Hint. You can use the finite subgroup test (i.e., Proposition 2.4.4) here. 


Cayley’s theorem does not say that the group G of order n is isomorphic to $S_n$. This is
impossible - unless n = 1. For $S_n$ has order n! while G has order only n. The symmetric
group is huge compared with G, even for n = 5 when n! = 120. Under the isomorphism of
Cayley’s theorem, the group $S_3$ is isomorphic to a subgroup of $S_6$, which is a group of order
720. In general, Cayley’s theorem realizes $S_n$ as a subgroup of the much larger group $S_{n!}$.
You could then use Cayley’s theorem again to view both groups as subgroups of $S_{(n!)!}$. And
you could keep going with that somewhat insane idea.


Exercise 3.2.9 Consider the group $\mathbb{Z}_5$ of integers mod 5 under addition. Enumerate the
elements as $\{[1], [2], [3], [4], [5]\}$. Find the permutations in $S_5$ coming from the proof of
Cayley’s theorem. For example, $\sigma_3([x]) = [3 + x \pmod 5]$. Then
$$\sigma_3 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 5 & 1 & 2 & 3 \end{pmatrix} = (14253).$$
Note that the fact that we got a 5-cycle is not a shock as the element in $S_5$ must
have the same order as that in $\mathbb{Z}_5$ to which it corresponds (by part (3) of Proposition 3.2.1).
Exercise 3.2.10 Show that the group $S_3$ is isomorphic to the following group of matrices
under the operation of matrix multiplication as defined in formula (1.3) of Section 1.8 and 
the statement that follows it: 


10 0\ /1 
G=<{o 1 O},{0 
0 0 17 \o 


Exercise 3.2.11 Show that isomorphism between groups is an equivalence relation. 


Definition 3.2.2 If G is a group and $T: G \to G$ is a group isomorphism, we say that T
is a group automorphism. The set of all automorphisms of G is called Aut(G) and it
forms a group under composition of functions.

Definition 3.2.3 Take a fixed element $g \in G$. Define the conjugation function $C_g(x) =
g^{-1}xg$ for all $x \in G$. If $y = g^{-1}xg$ for some $g \in G$, we say that x and y are conjugate
elements of G. If we have two groups G and H such that $H = x^{-1}Gx$, for some x in a
group containing G and H, we say that G and H are conjugate groups.


Exercise 3.2.12 Show that conjugation is a group automorphism. 


Of course, if the group is Abelian, conjugation is not very interesting because it is the 
identity map taking x to x. We call the automorphism C, an inner automorphism. The rest 
of the automorphisms of G are called outer automorphisms. 

Note that this idea of conjugate is quite different from that of complex conjugate. That 
is a horse of a different color which goes under the heading of field conjugation - really 
an element of a Galois group, that of the field automorphisms of C fixing elements of R. 
Such things will be discussed in Section 7.6 for finite fields. 

It is often useful to glomp together conjugate elements of a group. This leads to the 
following definition. 


Definition 3.2.4 The conjugacy class of an element x of the group G is
$\{x\} = \{g^{-1}xg \mid g \in G\}$.


Exercise 3.2.13 Show that there is an equivalence relation on a group G obtained by saying 
x,y € G are equivalent iff x and y are conjugate. Then show that the equivalence classes for 
this equivalence relation are the conjugacy classes. 


Exercise 3.2.14 Find all the conjugacy classes in A. 


Hint. Look at cycles and note that $\sigma(a_1 a_2 \cdots a_n)\sigma^{-1} = (\sigma(a_1)\,\sigma(a_2) \cdots \sigma(a_n))$, for any
permutation $\sigma \in S_n$.

Exercise 3.2.15 Find all the conjugacy classes in GL(2, $\mathbb{C}$), the group of 2 x 2 complex
matrices with nonzero determinant. Conjugate matrices were well studied in your linear
algebra book. They correspond to matrices that give the same linear mapping of $\mathbb{C}^2$
but with respect to different bases of $\mathbb{C}^2$. Such matrices are called “similar” in linear
algebra. The rational canonical form of a matrix (as well as the Jordan form) represents
the classes of similar matrices. See Section 7.2, Dornhoff and Hohn [25], or Strang
[115, Chapter 5].


Exercise 3.2.16 Find four subgroups of the symmetric group S4 that are isomorphic to S3. 


Exercise 3.2.17 State whether each of the following statements is true or false and give a 
brief reason for your answer. 


(a) The groups $S_4$ and $D_{12}$ are isomorphic.

(b) Define $f: \mathbb{Z} \to \mathbb{Z}$ by $f(x) = 2x$, $\forall x \in \mathbb{Z}$. Consider $\mathbb{Z}$ to be a group under addition. Then f is
a group isomorphism.

(c) Consider the integers Z and the rationals Q as groups under addition. These groups are 
isomorphic. 

(d) All infinite groups are isomorphic. 


Exercise 3.2.18 Show that the group Aut($\mathbb{Z}_n$) of automorphisms of $\mathbb{Z}_n$ under addition is
isomorphic to the group $\mathbb{Z}_n^*$. This is a case in which all but the trivial automorphism are
outer automorphisms.

Hint. Consider the map sending $\sigma \in$ Aut($\mathbb{Z}_n$) to $\sigma(1)$.

Exercise 3.2.19 Show that the group $\mathbb{R}$ of real numbers under addition is isomorphic to the
group $\mathbb{R}^+$ of positive real numbers under multiplication.

Hint. Make use of $e^x$ and $\log x$.


Exercise 3.2.20 Which of the following groups are isomorphic? 


(a) the group of symmetries of a square; 
(b) the group of symmetries of a rectangle; 
(c) the multiplicative group Z*,; 

(d) Za under addition; 

(e) the Klein 4-group from Exercise 2.2.1. 


3.3 Cosets, Lagrange’s Theorem, and Normal Subgroups 


In this section we want to generalize the concept of $\mathbb{Z}_n$ - the integers mod n. The elements
of $\mathbb{Z}_n$ are sets of the form $a + n\mathbb{Z}$. We call such sets cosets. Since the group $\mathbb{Z}$ is Abelian
it does not matter whether a is on the right or left of $n\mathbb{Z}$. When a group is not Abelian, it
does matter. In any case, the point is that we want to view the coset as an object in its own
right by equating everything in a given coset, just as we did for $\mathbb{Z}_n$.

Definition 3.3.1 If H is a subgroup of the group G, we say that a left coset of H is a set
of the form $gH = \{gh \mid h \in H\}$. One can similarly define a right coset of H to be a set Hg.
We denote by G/H the set of all left cosets and H\G the set of right cosets.

The left coset gH in the definition is the image of H under the left multiplication map
$L_g: G \to G$ defined by $L_g x = gx$, for all $x \in G$. See Definition 2.3.2. This is why we call gH a
“left” coset and not a “right” coset.

Example. Take the group $G = \mathbb{Z}$ under addition and the subgroup $H = n\mathbb{Z} = \{nq \mid q \in \mathbb{Z}\}$. We
view the elements of $\mathbb{Z}_n$ as cosets:
$$[a] = a + n\mathbb{Z} = \{a + nq \mid q \in \mathbb{Z}\}.$$

As we said, because this particular group is Abelian, left cosets are the same as right
cosets. ▲


Proposition 3.3.1 (Facts About Cosets). Suppose H is a subgroup of the finite group G. Then
we have the following facts. Fact (2) holds even if G and H are not finite.

(1) $|H| = |Hg| = |gH|$ for all $g \in G$.

(2) We have an equivalence relation on G by defining $x \sim y$ iff $x^{-1}y \in H$. Then the equivalence
classes are the left cosets which partition G into a disjoint union of left
cosets.

(3) $|G/H| = \dfrac{|G|}{|H|}$.

Proof.

(1) The left multiplication map from Definition 2.3.2 is $L_g(x) = gx$, for all $x \in G$. This map
takes H 1-1 onto the left coset gH. Thus $|H| = |gH|$.

(2) If $x^{-1}y = h \in H$, then $y = xh \in xH$, and conversely. To see that this relation is an equivalence
relation, we just need to prove that it is reflexive, symmetric, and transitive. The
proofs are as follows.

(a) Reflexive $x \sim x$: $x^{-1}x = e \in H$.

(b) Symmetric $x \sim y \Rightarrow y \sim x$: $x^{-1}y = h \in H$ implies $h^{-1} = y^{-1}x \in H$.

(c) Transitive $x \sim y$ and $y \sim z \Rightarrow x \sim z$: $x^{-1}y = h \in H$ and $y^{-1}z = k \in H$ implies $hk =
x^{-1}yy^{-1}z = x^{-1}z \in H$.

It follows that equivalence classes for this equivalence relation are the left cosets.

(3) This follows from (1) and (2) since G is a disjoint union of $|G/H|$ left cosets each of
which has the same number $|H|$ of elements. Thus $|G| = |G/H|\,|H|$. ▲


Example 1. Take the group $G = \mathbb{Z}_6$ under addition along with the subgroup $H =
\langle 2 \pmod 6 \rangle = \{2, 4, 0 \pmod 6\}$. The cosets of H are H and $1 + H = \{1, 3, 5 \pmod 6\}$. We
see that each coset has three elements, the cosets are disjoint, and every element of G is in
some coset, just as Proposition 3.3.1 predicted. ▲

Example 2. Let $G = S_3$ and $H = \langle (123) \rangle = \{(1), (123), (132)\}$. A computation shows that
the coset $(12)H = \{(12), (23), (13)\}$. Thus G is a disjoint union of H and (12)H. This is just
a special case of part (2) of the preceding proposition. ▲
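The computation in Example 2 can be reproduced in a few lines. This is a sketch assuming permutations of {1, 2, 3} are stored in one-line notation, so `(2, 3, 1)` stands for the cycle (123) and `(2, 1, 3)` for (12); `left_coset` is an illustrative helper name.

```python
from itertools import permutations

def compose(p, q):
    """Product pq, applied right to left."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

S3 = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]      # (1), (123), (132)

def left_coset(g, subgroup):
    """The left coset g * subgroup as a frozenset."""
    return frozenset(compose(g, h) for h in subgroup)

cosets = {left_coset(g, H) for g in S3}
coset_of_12 = left_coset((2, 1, 3), H)

print(len(cosets), sorted(coset_of_12))
```

There are exactly two cosets, each of size 3, and together they cover all 6 elements of $S_3$, in line with $|G| = |G/H|\,|H|$.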
The preceding facts about cosets lead quickly to Lagrange’s theorem.

Theorem 3.3.1 (Lagrange’s Theorem). Suppose that H is a subgroup of a finite group G.
Then $|H|$ divides $|G|$.

Proof. This follows from part (3) of the preceding proposition. ▲

Instead of calling this Lagrange’s theorem, we should perhaps have called it Lagrange’s
corollary - given the length of the proof. However, it is important - especially for
Section 4.5 - and I guess that justifies calling it a theorem.


Corollary 3.3.1

(1) If G is a finite group, the order of any element of G must divide the order of G.
(2) Any group G of prime order is cyclic.
(3) (Fermat’s Little Theorem). If $a \in \mathbb{Z}_p$, where p is prime, then $a^p \equiv a \pmod p$.

Proof.

(1) We know that the order of the cyclic group generated by a, namely $\langle a \rangle$, is the same as
the order of the element a of G by fact 2 in Section 2.5. The result thus follows from
Lagrange’s theorem.

(2) A prime p has no positive divisors but p and 1. Thus from part (1) if $|G| = p$, then the
order of any element $a \ne e$ of G must be p. So $\langle a \rangle = G$.

(3) If $a \equiv 0 \pmod p$ the result is clear. Otherwise $a \pmod p$ is in the group $\mathbb{Z}_p^*$ which has
order $p - 1$. This implies (by part (1)) that $a^{p-1} \equiv 1 \pmod p$. Multiply the congruence
by a to obtain the result. ▲
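Parts (1) and (3) are easy to test numerically for small primes. The sketch below is only a sanity check, not a proof; the list of primes and the helper `order_mod` are choices made here for illustration.

```python
def order_mod(a, p):
    """Multiplicative order of a in Z_p^* (assumes p is prime and p does not divide a)."""
    k, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        k += 1
    return k

primes = [2, 3, 5, 7, 11, 13]

# Fermat's little theorem: a^p = a (mod p) for every a in Z_p.
fermat_holds = all(pow(a, p, p) == a % p for p in primes for a in range(p))

# Part (1): the order of each element of Z_p^* divides |Z_p^*| = p - 1.
orders_divide = all((p - 1) % order_mod(a, p) == 0 for p in primes for a in range(1, p))

print(fermat_holds, orders_divide, order_mod(3, 7))
```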


Exercise 3.3.1 Find all the subgroups of the multiplicative groups Z7,Z3, Z5, Zio. 


Exercise 3.3.2 Find all left cosets of the subgroup $H = \{1, 11 \pmod{20}\}$ in the multiplicative
group $\mathbb{Z}_{20}^*$.

Exercise 3.3.3 List all the subgroups of the tetrahedral group $A_4$. Then draw the poset
diagram for the subgroups of $A_4$. Note that $A_4$ has no subgroup of order 6 and thus that
the converse of Lagrange’s theorem is false.


Hint. Group Explorer will do this exercise for you but you should explain why your list is 
complete. The program will even arrange the multiplication table according to cosets of a 
subgroup. 


Definition 3.3.2 A subgroup H of the group G is called a normal subgroup iff $g^{-1}Hg = H$
for all $g \in G$. The equality $g^{-1}Hg = H$ means $H = \{g^{-1}hg \mid h \in H\}$.

It is clear that every subgroup of an Abelian group is normal. In the next section we shall
see why normal subgroups are nice - they are the only type of subgroups that allow us to
make the set of cosets gH into a group G/H using the same method that worked for $\mathbb{Z}_n$.

The use of the word “normal” is not intended to imply that most subgroups are normal
or that such normal subgroups are particularly ordinary. As usual in mathematics, the only
meaning the word “normal” has is contained in its definition, Definition 3.3.2 here.


Exercise 3.3.4 Show that a subgroup H of G is normal iff gH = Hg for all g €G, that is, iff 
every right coset is a left coset. 


Exercise 3.3.5 Find all normal subgroups of $S_3$.


Exercise 3.3.6 Are the following statements true or false? 

(a) Z7 is a normal subgroup of Zy4. 

(b) $\langle F \rangle$ is a normal subgroup of $D_3$, using the notation of Section 2.1.
(c) $\langle R \rangle$ is a normal subgroup of $D_3$, using the notation of Section 2.1.

Exercise 3.3.7 Show that if H is a subgroup of group G and $|G/H| = 2$, then H is a normal
subgroup of G. This shows that the alternating group $A_n$ is a normal subgroup of the
symmetric group $S_n$.


Example. Consider the multiplication table for the dihedral group $D_3$ arranged according
to left cosets of the subgroup $\langle R \rangle$ - using the notation of Section 2.1. Recall that $RF = FR^2$.

Multiplication table for $D_3$ (elements in the order $I$, $R$, $R^2$ | $F$, $FR$, $FR^2$)
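A table with rows and columns in the order $I, R, R^2, F, FR, FR^2$ can be generated from any concrete model of $D_3$. The sketch below assumes one such model, the permutations of the three vertices of a triangle with R = (123) and F = (12), multiplied right to left; Section 2.1 may label the motions differently, so treat the model as an assumption. It does satisfy $RF = FR^2$.

```python
def compose(p, q):
    """Product pq, applied right to left, in one-line notation on {1, 2, 3}."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

I = (1, 2, 3)
R = (2, 3, 1)          # assumed model: rotation as the 3-cycle (123)
F = (2, 1, 3)          # assumed model: flip as the transposition (12)
R2 = compose(R, R)

order = [I, R, R2, F, compose(F, R), compose(F, R2)]
names = dict(zip(order, ["I", "R", "R^2", "F", "FR", "FR^2"]))

# table[(a, b)] is the name of the product ab.
table = {(names[a], names[b]): names[compose(a, b)] for a in order for b in order}

for a in order:
    row = " ".join(table[(names[a], names[b])].ljust(5) for b in order)
    print(names[a].ljust(5), "|", row)
```

In the printed table the upper left 3 x 3 block contains only $I, R, R^2$, and the four 3 x 3 blocks correspond to the cosets $\langle R \rangle$ and $F\langle R \rangle$.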


There are graphs associated to the cosets G/H and a generating set S of G. These are 
called Schreier graphs. The vertices are the cosets gH, g € G. Draw an arrow from coset 
gH to coset sgH for each s € S. There are many examples of these graphs in Terras [116] or 
Terras [117]. Schreier graphs can have loops and multiple edges. A 


Example. Let $G = K_4$, the Klein 4-group. Recall the multiplication table in Figure 2.14. Let
$H = \{e, h\}$. Find G/H and draw the corresponding Schreier graph for the edge set $S = \{h, v\}$.
To create the Schreier graph you take the Cayley graph $X(K_4, S)$ and glomp together vertices
which are in the same H-coset. These graphs will be undirected since $x \in S$ implies $x^{-1} \in S$.
See Figure 3.2, which shows the Cayley graph $X(K_4, S)$ on the left and the Schreier graph
for G/H on the right. ▲


Exercise 3.3.8 Show that an element g of the group G defines a function $L_g(xH) = gxH$,
for all $xH \in G/H$. Then show that the function $L_g: G/H \to G/H$ is 1-1 and onto. Moreover
show that $L_{gg'} = L_g \circ L_{g'}$. We view $L_g$ as defining a group action of the group G on G/H - a
concept we will consider in Section 3.7.

Figure 3.2 On the left is the Cayley graph for the Klein 4-group $K_4 = \{e, h, v, hv\}$, with
generating set $S = \{h, v\}$ using the notation of Figure 2.14. On the right is the Schreier
graph for $K_4/H$, where $H = \{e, h\}$, with the same set $S = \{h, v\}$.


Exercise 3.3.9 Draw the Schreier graph for the quotient G/H with 


G=aan(s)= { ¢ ') a,b EZs, ooh, 


na{|(5 *) vex}, anas={(2 °).(51 1)} 


Exercise 3.3.10 Just as in Example 3 of Section 2.3, consider the group G of non-singular
3 x 3 matrices with entries in the integers mod 2. Draw the Schreier graph for the quotient
G/H with $G = GL(3, \mathbb{Z}_2)$,
$$H = \text{the subgroup of matrices of the form } \begin{pmatrix} 1 & * & * \\ 0 & * & * \\ 0 & * & * \end{pmatrix},$$
where * means the entry can be either 0 or 1, and generating set
$$S = \left\{ \begin{pmatrix} 0 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} \right\}.$$

Change the subgroup H to the subgroup consisting of transposes of the matrices in H. The
transpose of a matrix $M = (m_{ij})$ is defined to be the matrix ${}^tM = (m_{ji})$. This produces a
Schreier graph whose adjacency matrix has the same eigenvalues as the original. Recall
that the eigenvalues $\lambda \in \mathbb{C}$ of matrix A are the solutions of $\det(A - \lambda I) = 0$. But the two
graphs are not isomorphic graphs (meaning not related by a 1-1, onto map between vertices
preserving adjacency of vertices). See Terras [116] or Terras [117].
The graphs in the preceding exercise helped to answer a question posed by Kac [51] -
“Can you hear the shape of a drum?” This means can you distinguish two drums by their 
sounds (i.e., fundamental frequencies of vibration). Carolyn Gordon, David Webb, and Scott 
Wolpert used the same basic construction in 1992 to show that there are (non-convex) pla- 
nar drums that cannot be heard. This means that the fundamental frequencies of vibration 
- eigenvalues of the Laplace operator - are the same for the two drums, although one drum 
cannot be turned into the other by a rigid motion. 

It is also possible to think about double cosets of G modulo two subgroups H and K. These 
are sets HgK = {hgk|h € H, k€ K}. The set of all such double cosets is denoted H\G/K. The 
idea finds many applications in number theory and group representation theory. Once more 
the double cosets partition G into a disjoint union, but the double cosets may have different 
sizes. Double cosets are lurking behind Figure 0.4 in the preface. 


Example. Let $G = S_3$, $H = \langle (12) \rangle$, $K = \langle (13) \rangle$. Find the double cosets H\G/K. Then one
double coset is $HK = \{(1), (12), (13), (12)(13) = (132)\}$, while the other double coset is
$\{(23), (12)(23) = (123)\}$. ▲
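The two double cosets in this example, and the fact that they have different sizes, can be confirmed by brute force. This is a sketch assuming one-line notation and right-to-left multiplication, with `(2, 1, 3)` for (12) and `(3, 2, 1)` for (13).

```python
from itertools import permutations

def compose(p, q):
    """Product pq, applied right to left."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

S3 = list(permutations((1, 2, 3)))
e = (1, 2, 3)
H = [e, (2, 1, 3)]     # the subgroup generated by (12)
K = [e, (3, 2, 1)]     # the subgroup generated by (13)

def double_coset(g):
    """The set HgK = {hgk : h in H, k in K}."""
    return frozenset(compose(compose(h, g), k) for h in H for k in K)

double_cosets = {double_coset(g) for g in S3}
sizes = sorted(len(c) for c in double_cosets)

print(sizes, [sorted(c) for c in double_cosets])
```

The sizes come out as 2 and 4, so the double cosets partition $S_3$ but are not all of the same size.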


Exercise 3.3.11 Show that given a group G and two subgroups H and K, we can define an
equivalence relation on G by saying $x, y \in G$ are equivalent and writing $y \sim x$ iff there are
elements $h \in H$, $k \in K$ such that $y = hxk$. What are the equivalence classes?


Exercise 3.3.12 Draw the Schreier graph for G=S3 and H= ((12)). Then do the same for 
G=S,, H=((12)). 


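Exercise 3.3.13 Find a normal subgroup of the affine group Aff(7):
$$\mathrm{Aff}(7) = \left\{ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \;\middle|\; a, b \in \mathbb{Z}_7,\ a \ne 0 \right\}.$$
The operation is matrix multiplication.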


3.4 Building New Groups from Old, I: Quotient or Factor Groups G/H


We want to imitate the construction of the integers modulo n. Suppose that H is a subgroup
of the group G. Recall that a left coset gH is the set of all gh, for $h \in H$. Take $G = \mathbb{Z}$ under
addition and the subgroup $H = n\mathbb{Z}$, consisting of all multiples of n. Then we have already
made use of the cosets:
$$[0] = H, \quad [1] = 1 + H, \quad \ldots, \quad [n-1] = (n-1) + H.$$

These are the elements of $\mathbb{Z}_n$. We were able to add integers mod n by writing $[a] + [b] =
[a + b]$. We had to show that this makes sense: that is,
$$[a] = [a'],\ [b] = [b'] \quad \text{implies} \quad [a + b] = [a' + b'].$$

It is not possible to do the analog for any subgroup H of a group G. We will need to
restrict our consideration to normal subgroups H of G - as in Definition 3.3.2. Abnormal
subgroups need not apply for this construction job. Of course every subgroup of an Abelian
group is normal. So let us consider a non-Abelian example.
Example. Consider the group $S_n$ of permutations of n objects. Which subgroups H of $S_n$
are normal? To answer this question, consider the following equation for an arbitrary cycle
$(a_1 a_2 \cdots a_r)$ of length r:
$$\sigma (a_1 a_2 \cdots a_r) \sigma^{-1} = (\sigma(a_1)\,\sigma(a_2) \cdots \sigma(a_r)), \quad \forall \sigma \in S_n.$$

The equation is valid since the permutations only affect the elements of $\{1, 2, \ldots, n\}$ of the
form $\sigma(a_j)$ for some $j = 1, 2, \ldots, r$. The element $\sigma(a_j)$ is sent to $\sigma(a_{j+1})$ by both sides for
$j = 1, \ldots, r - 1$. The element $\sigma(a_r)$ is sent to $\sigma(a_1)$.

This shows that if H is a normal subgroup of $S_n$ and contains a cycle of length r, then H
must contain all other cycles of length r. This means that the only normal subgroups of $S_3$
are the trivial subgroup $\{(1)\}$, the cyclic subgroup of order 3: $\langle (123) \rangle = \{(1), (123), (132)\}$,
and $S_3$ itself. ▲
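Both the conjugation formula and the list of normal subgroups of $S_3$ can be checked by exhaustive search. The sketch below assumes one-line notation and right-to-left products; the cycle (134) in $S_5$ is an arbitrary test case, and `cycle`, `inverse`, and `is_normal` are illustrative helper names.

```python
from itertools import permutations

def compose(p, q):
    """Product pq, applied right to left."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v - 1] = i + 1
    return tuple(inv)

def cycle(c, n):
    """The cycle (c[0] c[1] ... c[-1]) in S_n, in one-line notation."""
    p = list(range(1, n + 1))
    for k, a in enumerate(c):
        p[a - 1] = c[(k + 1) % len(c)]
    return tuple(p)

n, c = 5, (1, 3, 4)
formula_holds = all(
    compose(compose(s, cycle(c, n)), inverse(s)) == cycle(tuple(s[a - 1] for a in c), n)
    for s in permutations(range(1, n + 1))
)

def is_normal(H, G):
    """True when g^{-1} H g = H for every g in G (Definition 3.3.2)."""
    Hs = set(H)
    return all({compose(compose(inverse(g), h), g) for h in H} == Hs for g in G)

S3 = list(permutations((1, 2, 3)))
A3 = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]     # (1), (123), (132)
order_two = [(1, 2, 3), (2, 1, 3)]          # (1), (12)

print(formula_holds, is_normal(A3, S3), is_normal(order_two, S3))
```

The subgroup generated by (12) fails the test because conjugation moves (12) to the other transpositions, exactly as the cycle formula predicts.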


Exercise 3.4.1 Find all the normal subgroups of $D_4$.
Exercise 3.4.2 Find all the normal subgroups of $A_4$.
Exercise 3.4.3 Let our group be the affine group
$$G = \mathrm{Aff}(5) = \left\{ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \;\middle|\; a, b \in \mathbb{Z}_5,\ a \ne 0 \right\}.$$

The operation is matrix multiplication. Find a normal subgroup of order 5 and then another
normal subgroup of order 10. Prove that the two subgroups are indeed normal.


Now we need to multiply the cosets of a normal subgroup. We imitate what worked for 
the subgroup nZ of Z in Sections 1.6 and 2.3. 


Definition 3.4.1 If H is a normal subgroup of the group G, define the quotient (or factor)
group G/H to be the set G/H of left cosets with multiplication $(aH)(bH) = abH$,
$\forall a, b \in G$.

Theorem 3.4.1 If G is a group with normal subgroup H, then using Definition 3.4.1,
G/H is a group.

Proof. Defining the multiplication of subsets S and T of G by:
$$ST = \{xy \mid x \in S,\ y \in T\}, \tag{3.2}$$
we can consider $(aH)(bH)$ as the product of two cosets. This is of course independent of the
representatives a and b which were chosen. Then since H is normal, $(aH)(bH) = a(Hb)H =
abHH = abH$, since $HH = H$. To see that G/H is a group, we need to find an identity element.
That identity is H since $aHH = aH$ and $HaH = aHH = aH$. For any subset $S \subset G$, define
$S^{-1} = \{x^{-1} \mid x \in S\}$. Then $(aH)^{-1} = H^{-1}a^{-1} = Ha^{-1} = a^{-1}H$ and $(a^{-1}H)(aH) = (Ha^{-1})(aH) = H$.
The associative law is hopefully clear. ▲
Example. Recall the multiplication table of the example in the preceding section with $G =
D_3$, $H = \langle R \rangle$, using the notation of Section 2.1.

Multiplication table for $D_3$ (elements in the order $I$, $R$, $R^2$ | $F$, $FR$, $FR^2$)

Then we see that, since that multiplication table glomps the group elements according
to cosets, the group $D_3/\langle R \rangle$ has the standard multiplication table for a group with two
elements. The only coset other than $\langle R \rangle$ itself is $F\langle R \rangle$. So we get the table shown. Of course you can see this
easily directly since $F\langle R \rangle = \langle R \rangle F$ and $F^2 = I$.

Multiplication table for $D_3/\langle R \rangle$

·  | $I\langle R \rangle$ | $F\langle R \rangle$
$I\langle R \rangle$ | $I\langle R \rangle$ | $F\langle R \rangle$
$F\langle R \rangle$ | $F\langle R \rangle$ | $I\langle R \rangle$

▲


Exercise 3.4.4 Imitate the computations of the preceding example for Aff(5)/H, where H 
denotes the normal subgroup of order 5 found in Exercise 3.4.3. 


Exercise 3.4.5 Imitate the computations of the preceding example for D_4/⟨R^2⟩. Can you 
identify the group D_4/⟨R^2⟩ with one of the groups considered earlier? 


Exercise 3.4.6 Suppose that H is a normal subgroup of G. Using the definition of set mul- 
tiplication in equation (3.2), prove the following equalities which were used in our proof of 
Theorem 3.4.1. 

(a) HH=H. 

(b) (aH) (bH) = a (Hb) H. 

(c) H^{-1} = {h^{-1} | h ∈ H} = H. 


Exercise 3.4.7 Show that the center Z(G) is a normal subgroup of the group G. 


Exercise 3.4.8 Show that the intersection of two normal subgroups of G is a normal subgroup 
of G. 


Definition 3.4.2 A simple group G is such that the only normal subgroups H of G are 


H={e} and H=G. 

It turns out that the alternating groups A_n are simple for n ≥ 5. Many people 
have written thousands of pages on the classification of finite simple groups. Many 
finite simple groups are in infinite families of groups like A_n. Others are the so-called 
sporadic groups. The largest of those is called the monster group - with 
order 808 017 424 794 512 875 886 459 904 961 710 757 005 754 368 000 000 000. The choice 
of the word “simple” here is misleading, as cyclic groups with any non-trivial proper 
subgroups are certainly not simple groups, yet in many ways they are simpler than simple 
groups. For more information on the classification of finite simple groups see Wikipedia or 
Wilson [127]. 

The fact that A_5 is simple, plus a little Galois theory, implies that it is impossible to solve 
the general quintic equation over Q by repeated radicals. This can of course be done for 
quadratic equations - recalling the quadratic formula from junior high school or middle 
school. It can also be done for cubic and quartic equations. But once you have an equation 
involving a polynomial of degree 5 over the rationals, if you can find that its Galois 
group over Q is S_5 or A_5, then you know that no formula generalizing the quadratic formula, 
even one involving higher order radicals than the square root, can express the 
roots of the polynomial. See Dummit and Foote [28, Chapter 14] or Gallian [33, Chapter 
32] for more information on this subject. Dummit and Foote give the example of the quintic 
x^5 − x + 1 over Q. We will not touch this subject since we will consider Galois theory only 
for finite fields in Section 7.6. 


Exercise 3.4.9 Show that a quotient G/H is cyclic if G is cyclic. 
Exercise 3.4.10 Which Abelian groups are simple? 


Exercise 3.4.11 If G is a group of order 15, show that G has an element of order 3. 


Hint. There is no problem if G is cyclic - thanks to theorems in Section 2.5. What does 
Lagrange tell you about possible orders of elements of G? 


We will have more to say about groups of order < 15 in Section 4.5. 

Suppose you are given a group G and you hate G because it is not Abelian. You might 
want to make an Abelian group related to G. To do this one forms the commutator subgroup 
G′, which is the subgroup of G generated by elements of the form aba^{-1}b^{-1}, for a, b ∈ G. 
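
To make the definition concrete, here is a small Python sketch (our own illustration, not taken from the text) that computes the commutator subgroup of S_3 by brute force. It forms every commutator aba^{-1}b^{-1} and then closes the resulting set under multiplication, which is enough to generate a subgroup of a finite group. The function names are hypothetical helpers.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(gens, identity):
    """Close a set of generators under multiplication (valid for finite groups)."""
    elements = {identity} | set(gens)
    frontier = list(elements)
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

G = list(permutations(range(3)))   # S_3
e = tuple(range(3))

# All commutators a b a^{-1} b^{-1}.
commutators = {
    compose(compose(a, b), compose(inverse(a), inverse(b))) for a in G for b in G
}
G_prime = generated_subgroup(commutators, e)
print(sorted(G_prime))
```

For S_3 the result has 3 elements, so the quotient S_3/G′ has order 2 and is therefore Abelian, in line with what the next exercise asks you to prove in general.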


Exercise 3.4.12 Show that the commutator subgroup G’ is a normal subgroup of G and then 
show that the group G/G’ is Abelian. 


Exercise 3.4.13 Suppose that H is a subgroup of G but H is not a normal subgroup of G. 
Show that the multiplication of cosets given by aHbH = abH, for a, b ∈ G, does not turn G/H 
into a group. 


Exercise 3.4.14 Suppose that H is a normal subgroup of the finite group G. If G/H has an 
element of order n, show that G has an element of order n. 


Hint. First show that the order r of the coset gH in G/H divides the order of g ∈ G. Then 
look at g’. 


Exercise 3.4.15 Show that A_4 is not a simple group. 


3.5 Group Homomorphism 


We have already defined group isomorphisms. Homomorphisms are similar except they 
need not be 1-1 and onto. The idea comes from C. Jordan in 1870. It is an extremely useful 
idea. Once you have seen the definition, you will find group homomorphisms all over this 
book. 


Definition 3.5.1 If G and G′ are groups, a function T: G → G′ is a group homomorphism 
iff T(xy) = T(x)T(y) for all x, y ∈ G. 


Example 1. Let G = Z under addition and G′ = Z_n under addition mod n for n > 1. 
Define T(x) = x (mod n). Then T: Z → Z_n is a group homomorphism which is not an 
isomorphism. ▲ 


Example 2. Let G = S_n, for n ≥ 2 and consider {±1} as a group under multiplication. 
The map sgn: S_n → {±1} defined by equation (3.1) is a group homomorphism by 
Proposition 3.1.2. It cannot be 1-1 unless n = 2. ▲ 


Example 3. Let R be the group of real numbers under addition and let C be the group of 
complex numbers under addition: that is, C = {x + iy | x, y ∈ R}, where i = √−1. If x + iy 
and u + iv ∈ C, with x, y, u, v ∈ R, we define the sum by 


(x + iy) + (u + iv) = (x + u) + i(y + v). 


We have a group homomorphism T: R → C defined by T(x) = x. It is 1-1 but not onto as 
i ∉ R. ▲ 


Exercise 3.5.1 Show that the familiar exponential function φ: R → R* defined by setting 
φ(x) = e^x, for every real number x, is a group homomorphism if we consider R as a group 
under addition and R* as a group under multiplication. Is φ an isomorphism? 


Now we define an important subgroup associated to a group homomorphism. 


Definition 3.5.2 Suppose G, G′ are groups and T: G → G′ is a group homomorphism. 
The kernel of T is ker T = {g ∈ G | T(g) = e′}, where e′ is the identity of G′. 


What are the kernels of the homomorphisms in the three examples above? 


Example 1. The kernel is nZ. ▲ 
Example 2. The kernel is the alternating group A_n. ▲ 
Example 3. The kernel is {0}. ▲ 
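
The first two kernels can be checked by machine in small cases. The sketch below is our own illustration with hypothetical helper names: it tests the homomorphism property of sgn on S_4 and of reduction mod 6 on a finite window of integers, and lists the kernel elements it finds. A finite check like this is evidence rather than a proof.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def sgn(p):
    """Sign of a permutation, found by counting inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

# Example 2 with n = 4: sgn is a homomorphism S_4 -> {+1, -1}.
S4 = list(permutations(range(4)))
sgn_is_hom = all(sgn(compose(p, q)) == sgn(p) * sgn(q) for p in S4 for q in S4)
kernel_sgn = [p for p in S4 if sgn(p) == 1]   # the even permutations, i.e. A_4

# Example 1 with n = 6, tested on the window -30..30 of Z.
n = 6
window = range(-30, 31)
mod_is_hom = all((x + y) % n == ((x % n) + (y % n)) % n for x in window for y in window)
kernel_mod = [x for x in window if x % n == 0]  # multiples of 6 in the window

print(sgn_is_hom, len(kernel_sgn), mod_is_hom, kernel_mod)
```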


The following lemma should be familiar from Proposition 3.2.1. The lack of injectivity 
(or 1-1 ness) does not hamper us in proving these basic facts. 

Lemma 3.5.1 Suppose G, G′ are groups and T: G → G′ is a group homomorphism. If e is 
the identity of G and e′ is the identity of G′, then T(e) = e′ and T(x^{-1}) = T(x)^{-1}. 


Proof. First note that T(e)T(x) = T(ex) = T(x) = e′T(x) implies by the cancellation law that 
T(e) = e′. 
Second note that T(x^{-1})T(x) = T(x^{-1}x) = T(e) = e′. Similarly T(x)T(x^{-1}) = e′. ▲ 


Lemma 3.5.2 The group homomorphism T: G → G′ is an isomorphism iff it is onto and the 
kernel is the trivial subgroup; that is, ker T = {e}, where e is the identity of G. 


Proof. We need to show that T is 1-1 iff ker T = {e}. Suppose T is 1-1. Then, since 
T(e) = e′ by Lemma 3.5.1, clearly ker T = {e}. Conversely, suppose ker T = {e}. Then 
T(x) = T(y) implies, using Lemma 3.5.1, that T(xy^{-1}) = T(x)T(y)^{-1} = e′, the identity 
of G′. But then xy^{-1} ∈ ker T = {e} and xy^{-1} = e, which implies x = y. Thus T is 1-1. ▲ 


Exercise 3.5.2 Suppose that G, G′ are finite groups and T: G → G′ is a group homomorphism. 
Show that the order |T(g)| divides the order |g|, for all g ∈ G. 


Example 1. Let G = Z under addition and G′ = Z_n under addition mod n. Define T(x) = 
x (mod n). The kernel of T is the additive subgroup nZ. ▲ 


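Example 2. Consider the group R^2 of column vectors $\begin{pmatrix} x \\ y \end{pmatrix}$, for x, y real. This is a group under 
componentwise addition: 

$$\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x+u \\ y+v \end{pmatrix}.$$

Given a 2 × 2 matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, we have 
a group homomorphism L_M: R^2 → R^2 defined by 

$$L_M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax+by \\ cx+dy \end{pmatrix}.$$

The kernel of L_M is called the nullspace of the matrix M (or the linear function L_M) in linear 
algebra. See Definition 7.2.3. ▲ 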


Exercise 3.5.3 Show that L_M in Example 2 is indeed a group homomorphism. What is its 
kernel? 


Lemma 3.5.3 If G, G′ are groups and T: G → G′ is a group homomorphism, then ker T is a 
normal subgroup of G. 


Proof. First let us show that ker T is a subgroup of G. We use the one-step subgroup test 
from Proposition 2.4.3. Suppose that x, y ∈ ker T. This means that if e′ is the identity of G′, 
T(x) = T(y) = e′. But then T(xy^{-1}) = T(x)T(y)^{-1} = e′. So xy^{-1} ∈ ker T. 

So now we need only show that a^{-1}(ker T)a = ker T, for all a ∈ G. If x ∈ ker T, look 
at T(a^{-1}xa) = T(a)^{-1}T(x)T(a) = T(a)^{-1}e′T(a) = e′. This means that a^{-1}(ker T)a ⊂ ker T. 
Multiply the set-theoretic inclusion by a on the left and a^{-1} on the right to get ker T ⊂ 
a(ker T)a^{-1}. Why is this legal? Then replace a by a^{-1} to see that ker T ⊂ a^{-1}(ker T)a. 
This finishes the proof that a^{-1}(ker T)a = ker T. ▲ 

Exercise 3.5.4 Answer the “why?” in the middle of the second paragraph of the proof of 
the last lemma. 


Example 3. We define a homomorphism f taking the additive group R of real numbers to 
the multiplicative group T of complex numbers of absolute value 1, T = {z ∈ C | |z| = 1}. 
The homomorphism is defined by f: R → T, f(x) = e^{2πix} = cos(2πx) + i sin(2πx) (Euler’s 
identity). Here i = √−1. Note that f(x + y) = f(x)f(y). Question: What is ker f? Answer: 
ker f = Z. You see this from the diagram showing the location of the point e^{2πix} = cos(2πx) + 
i sin(2πx) in the plane. What are the polar coordinates of this point? The radius is 1 and 
the angle is 2πx. ▲ 
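
A quick numerical sanity check of this example is sketched below in Python, using the standard `cmath` module. It is only an illustration with floating-point arithmetic, so equalities are tested up to a small tolerance that we chose arbitrarily.

```python
import cmath
import math

def f(x):
    """f(x) = e^{2πix}, a point on the unit circle."""
    return cmath.exp(2j * math.pi * x)

TOL = 1e-9  # tolerance for floating-point comparisons (an arbitrary choice)

x, y = 0.3, 1.45
hom_ok = abs(f(x + y) - f(x) * f(y)) < TOL            # f(x + y) = f(x) f(y)
on_circle = abs(abs(f(x)) - 1) < TOL                  # |f(x)| = 1
integers_in_kernel = all(abs(f(k) - 1) < TOL for k in range(-5, 6))
half_not_in_kernel = abs(f(0.5) - 1) > 1              # f(1/2) = -1, so 1/2 is not in ker f

print(hom_ok, on_circle, integers_in_kernel, half_not_in_kernel)
```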


Example 4. Suppose that F = {f: R → R} and 
D = {f: R → R | f is everywhere differentiable}. 


Both F and D are groups under pointwise addition - defined by (f + g)(x) = f(x) + 
g(x), ∀ x ∈ R. Define T: D → F by Tf(x) = f′(x) = df/dx. The mapping T is a group 
homomorphism since the derivative of a sum is the sum of the derivatives (as shown in calculus 
or advanced calculus since calculus classes seem to be proofless). Question: What is the 
kernel of T? Answer: the functions on the real line with 0 derivative everywhere, that is, 
the constant functions (using the mean value theorem). ▲ 


Exercise 3.5.5 Show that the determinant gives a group homomorphism det: GL(n, R) → R*, 
where R* =R — {0} under multiplication. Here, as in Example 3 of Section 2.3, GL(n, R) 
is the general linear group of non-singular or invertible n x n real matrices under matrix 
multiplication defined as in formula (1.3) of Section 1.8 and the statement that follows 
it. Note that we will consider determinants in Section 7.3. You should also check that the 
general linear group is indeed a group. What is the identity? 


The following theorem goes back to C. Jordan in 1870. Emmy Noether proved more 
general versions of this and two other isomorphism theorems in 1927. See Exercises 3.5.9 
and 3.5.13. 

First we need a definition. 


Definition 3.5.3 Suppose that G, G′ are groups and f: G → G′ is a group homomorphism. 
The image group of G under f is 


f(G) = {f(x) | x ∈ G}. 


It is an exercise to show that f(G) is a subgroup of G’, under the hypotheses of the 
definition. See Exercise 3.5.7. 


Theorem 3.5.1 (First Isomorphism Theorem). Suppose that G, G′ are groups and f: 
G → G′ is a group homomorphism. Then the quotient group G/ker f is isomorphic to 
the image group f(G) from Definition 3.5.3 under the mapping 


F: G/ker f → f(G), 
defined by F(g ker f) = f(g), ∀ g ∈ G. 

Proof. For simplicity, let K = ker f. 

F is well defined. This means aK = bK implies F(aK) = F(bK). If aK = bK, then there is an 
x ∈ K such that b = ax. Thus, using the definition of K = ker f, F(bK) = F(axK) = f(ax) = 
f(a)f(x) = f(a)e′ = f(a) = F(aK). Here e′ is the identity in G′. 

F preserves group operations. This means F[(aK)(bK)] = F(aK)F(bK). Using our definition 
of multiplication of cosets in G/K, the definition of F, and the fact that f is a group 
homomorphism, we have F[(aK)(bK)] = F(abK) = f(ab) = f(a)f(b) = F(aK)F(bK). 

F is one-to-one. This means F(aK) = F(bK) implies aK = bK. By definition F(aK) = F(bK) 
implies f(a) = f(b), which implies (using the fact that f is a group homomorphism) that 
f(a^{-1}b) = e′, the identity in G′. Thus, by the definition of the kernel, a^{-1}b ∈ K = ker f. So 
a^{-1}b = x ∈ K. It follows that b = ax and therefore that bK = axK = aK, since xK = K for any 
x in K, as K is a subgroup. Thus F is 1-1. 

F is onto f(G). This is immediate from the definition of the image group, since every 
element of f(G) has the form f(g) = F(gK) for some g ∈ G. ▲ 
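
The theorem can be watched in action on a small example. The following Python sketch is our own illustration: the choice of the map f(x) = 3x (mod 12) on Z_12 is an assumption made for the example, not something from the text. It computes the kernel, the cosets of the kernel, and the image, and then builds the induced map F coset by coset.

```python
n = 12
G = range(n)   # Z_12 under addition mod 12

def f(x):
    """A homomorphism Z_12 -> Z_12 chosen for illustration: multiplication by 3."""
    return (3 * x) % n

kernel = frozenset(x for x in G if f(x) == 0)
image = {f(x) for x in G}
cosets = {frozenset((g + k) % n for k in kernel) for g in G}

# Build F: G/K -> f(G). F is well defined exactly when f is constant on each coset.
F = {}
for coset in cosets:
    values = {f(x) for x in coset}
    assert len(values) == 1, "f is not constant on a coset"
    F[coset] = values.pop()

# F is 1-1 and onto f(G): distinct cosets get distinct values covering the image.
is_bijection = len(set(F.values())) == len(F) and set(F.values()) == image
print(sorted(kernel), sorted(image), len(cosets), is_bijection)
```

Here the kernel is {0, 4, 8}, so there are 12/3 = 4 cosets, matching the 4 elements of the image {0, 3, 6, 9}.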


Exercise 3.5.6 Show that if H is a normal subgroup of G, then the map π: G → G/H 
defined by π(g) = gH is an onto group homomorphism. This map is often called the natural 
projection of G onto G/H. 


Exercise 3.5.7 Show that, under the assumptions of Definition 3.5.3, the image group f (G) 
does indeed form a subgroup of G’. 


Example 1. Recall that Z_m is the factor group Z/mZ under addition mod m. We have a 
homomorphism (an example of the natural projection in Exercise 3.5.6) f: Z → Z_m, given 
by f(x) = x + mZ, for all x ∈ Z. What is ker f? It is the subgroup mZ. In this case the first 
isomorphism theorem just says Z/mZ ≅ Z_m. This is not anything new. Visually, to get Z_m 
out of Z, you roll up the infinite discrete line of integers into a finite circle Z/mZ by 
identifying integers if their difference is a multiple of m. See Figure 3.3. ▲ 



Figure 3.3 Roll up Z to get Z/nZ 


Example 2. Consider the example again of a homomorphism taking the additive group R 
of real numbers to the multiplicative group T of complex numbers of absolute value 1, 
defined by 


f(x) = e^{2πix} = cos(2πx) + i sin(2πx). 

Here i = √−1. We can identify T with the unit circle using the angle variable 2πx. We saw 
that ker f = Z. It follows from the first isomorphism theorem that the additive quotient 
group R/Z is isomorphic to the multiplicative group of complex numbers of absolute value 
1: that is, R/Z ≅ T. This means that if you take the real line R mod the integers Z, you roll 
R up into a circle. See Figure 3.4. ▲ 


Figure 3.4 Roll up the real line to get a circle T=R/Z 


Fourier series are representations of periodic functions or equivalently functions on the 
circle R/Z = T. Jean-Baptiste Joseph Fourier invented the subject in the early 1800s while 
studying heat diffusion on a circular wire. Now when one uses computers for everything, 
the continuous circle should really be replaced by the finite circle Z/mZ. We will consider 
the finite Fourier transform in Section 4.2. See also my book [116, Part I]. The end result 
is that using group theory one can compute Fourier transforms fast enough to make them 
essential to modern signal processing. The most important algorithm in the subject is the 
fast Fourier transform or FFT (which was actually found by Gauss in 1805, while computing 
the orbit of the asteroid Juno). Another reference is Barry Cipra’s article [16] discussing the 
top 10 algorithms of the twentieth century. One is the FFT. Fourier’s work was really the 
beginning of applying group theory to practical problems. Mathematicians such as Laplace, 
Lagrange, and Legendre objected to his work as lacking in complete proofs. It took over a 
decade for Fourier’s work on heat conduction to appear. Filling in gaps in the proofs would 
help to inspire the development of real analysis. The finite analog needed only the work of 
Gauss. But group theory itself was only an embryo at the time of Fourier. 


Example 3. Suppose that G = ⟨a⟩ is a finite cyclic group of order n. Define the group 
homomorphism f: Z → G by f(k) = a^k, ∀ k ∈ Z. It is easily checked that f is indeed a group 
homomorphism. What is the kernel of f? 


ker f = {k ∈ Z | a^k = a^0} = {nx | x ∈ Z} = nZ. 


Here we used some of the facts about powers in cyclic groups from Section 2.5. Thus, 
by the first isomorphism theorem, any cyclic group of order n is isomorphic to the group 
Z_n = Z/nZ (under addition mod n). ▲ 


Exercise 3.5.8 State whether each of the following statements is true or false and give a 
brief explanation for your answer. 


(a) Consider the integers Z as a group under addition and define the function f: Z → Z by 
f(x) =2?, Vx € Z. Then f is a group homomorphism. 
(b) Consider Z_3 as a group under addition mod 3. Define f: Z_3 → Z_3 by f(x) =x , ∀ x ∈ Z_3. 
Then f is a group homomorphism. 


Exercise 3.5.9 (Third Isomorphism Theorem). If H is a normal subgroup of K, and H and K 
are normal subgroups of G, then G/K is isomorphic to the group (G/H)/(K/H). 


Hint. Use the first isomorphism theorem starting out with a map T: G/H → G/K, defined 
by T(gH) = gK, for g ∈ G. 


Exercise 3.5.10 Consider the multiplicative group R^+ of positive real numbers and the 
following functions f mapping R^+ into R^+. State whether each f is a group homomorphism. 
If so, find the image subgroup f(R^+) and kernel for each function f. 


(a) f(x)=2’. 
(b) f(x) = 3x. 
(c) f(x) = √x. 
(d) f(x) = 1/x. 


Exercise 3.5.11 Suppose that G is a finite group of prime order |G| = p and f: G → G is a 
group homomorphism. If e is the identity of G, then either f(x) = e, for all x ∈ G, or f is 
a group isomorphism. Why? Is it true that in the second case f(x) = x, for all x ∈ G? 


Exercise 3.5.12 Given an m × m real matrix A, define a function E_A: R → R^{m×m} by 

$$E_A(t) = \sum_{n=0}^{\infty} \frac{(tA)^n}{n!}.$$

You may assume that this series converges for all real numbers t. Show that E_A is a 
group homomorphism from the additive group R to the multiplicative group GL(m, R) 
of non-singular m × m real matrices. A non-singular matrix is a square matrix with 
nonzero determinant or, equivalently, a matrix with an inverse for multiplication - see 
Exercise 7.3.13. The image of E_A is called a one-parameter group. The function is also 
named the matrix exponential exp(tA). It is useful in the solution of differential equations, 
and generalizations are important in the theory of Lie groups like GL(n, R). 


Exercise 3.5.13 (Second Isomorphism Theorem). Suppose that K and N are subgroups of 
the group G and that N is normal in G. Show that K/(N ∩ K) is isomorphic to KN/N. 


Hint. Use the first isomorphism theorem applied to the map T: K → KN/N defined by 
T(k) = kN for k ∈ K. 


Exercise 3.5.14 Let G be a group and consider Z_n as a group under addition (mod n). Show 
that any group homomorphism T: Z_n → G is completely determined by the value T(1 mod n). 


Exercise 3.5.15 For n ≥ 2, consider Z_n as a group under addition (mod n) and let T denote 
the multiplicative group of complex numbers of absolute value 1. Define the map τ: Z_n → T 
by τ(x (mod n)) = e^{2πix/n}, for x ∈ Z. Show that τ is a 1-1 group homomorphism. Show that 
the image of τ is the group of nth roots of unity in C, that is, all the complex nth roots of 
1. 


3.6 Building New Groups from Old, II: Direct Product of Groups 


In the following definition we are inspired by the group R^2 consisting of vectors in the 
plane with addition defined by adding vectors in the usual way by adding componentwise. 


Definition 3.6.1 Given two groups G and H, we build a larger group consisting of ordered 
pairs called the direct product (or sum) of G and H denoted G ⊕ H (or G × H), defined 
by G ⊕ H = {(g, h) | g ∈ G, h ∈ H} with the operation of componentwise multiplication: 
(g, h)(g′, h′) = (gg′, hh′). 


It is pretty easily checked that the preceding definition makes G ⊕ H a group. The identity 
is (e, e′), where e is the identity of G, e′ is the identity of H. Inverses are given by (g, h)^{-1} = 
(g^{-1}, h^{-1}). The associative law follows from the associative laws for G and H. 


Exercise 3.6.1 Prove the claims in the preceding paragraph. 


Example 1. The usual Euclidean plane, R^2 = R ⊕ R, where we consider R to be a group 
under addition. Then R^2 is a group under vector addition in the plane. Of course, it actually 
has more structure than that as it is a vector space. See Section 7.1. ▲ 


Example 2. The Klein 4-group, Z_2 ⊕ Z_2 = Z_2^2, which consists of ordered pairs of 0s and 1s. 
The elements are: (0, 0), (0, 1), (1, 0), (1, 1). The operation is componentwise addition mod 
2. Thus (0, 1) + (1, 1) = (1, 0), for example. It is easily seen that each non-identity element 
has order 2. This group is isomorphic to Z as well as the group with multiplication table 
given in Figure 2.14. A 


Similarly one can take direct products of any number of groups, just as happens in 
calculus, when you form R^n = n-dimensional Euclidean space, for n ≥ 3. Of course, we can 
take direct products in which each component comes from a different group. There are lots 
of applications. Computers tend to like groups such as 


Z_2^n = Z_2 ⊕ Z_2 ⊕ ··· ⊕ Z_2   (n copies). 


Binary error-correcting codes are subgroups of this group. We will consider them later. 
There are also applications in cryptography and genetics. 

It is also possible to consider direct products of infinitely many groups. Then one would 
have vectors with infinitely many components. For example, sequences {x_n}_{n≥1} of real 
numbers x_n ∈ R can be added via {x_n} + {y_n} = {x_n + y_n} to get a group which might be 
called R^∞. We will not consider such groups here. 

Cayley graphs attached to some of these groups are shown in Figures 3.5, 3.6, and 3.7. 
The last image is a four-dimensional hypercube or tesseract. These graphs are undirected 
because the generating sets S have the property that s ∈ S implies s^{-1} ∈ S. 

Next we consider a couple of questions. 


Question 1. Let G and H be finite groups. Suppose a ∈ G and b ∈ H. What is the order of 
the element (a, b) ∈ G ⊕ H? 

Figure 3.5 Cayley graph for Z_2 ⊕ Z_2 with the 
generating set {(1, 0), (0, 1)} and bears at the 
vertices 


Figure 3.6 Cayley graph for Z_2 ⊕ Z_2 ⊕ Z_2 with 
the generating set {(1,0,0), (0,1,0), (0,0,1)} 
and koalas at the vertices 


Answer. Recall Definition 3.1.2 of the least common multiple r = lcm[m, n] if m = |a| and 
n = |b|. Then |(a, b)| = lcm[|a|, |b|]. 


Proof. First note that 


(a, b)^k = (a^k, b^k) = (e, e′) ⟺ a^k = e = identity of G and b^k = e′ = identity of H.   (3.3) 


Let k be the order of (a, b). The equivalence (3.3) implies m = |a| divides k and n = |b| divides 
k. So k is a common multiple of m and n, and r = lcm[m, n] is less than or equal to k, the 
order of (a, b). 

On the other hand, as r = lcm[m, n] is a common multiple of m and n, by (3.3), replacing 
k by r, we see that (a, b)^r = (e, e′) and thus k, the order of (a, b), divides r. So k ≤ r ≤ k, 
which means that r = k. ▲ 
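
The formula |(a, b)| = lcm[|a|, |b|] can be tested by brute force in the additive groups Z_m ⊕ Z_n. The Python sketch below is our own illustration; the moduli 12 and 18 and the helper names are arbitrary choices. It uses the standard fact that the additive order of a in Z_m is m/gcd(a, m).

```python
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def order_mod(a, m):
    """Additive order of a in Z_m, namely m / gcd(a, m)."""
    return m // gcd(a, m)

def order_pair(a, b, m, n):
    """Order of (a, b) in Z_m ⊕ Z_n, found by adding (a, b) to itself until (0, 0)."""
    k, x, y = 1, a % m, b % n
    while (x, y) != (0, 0):
        x, y, k = (x + a) % m, (y + b) % n, k + 1
    return k

m, n = 12, 18
formula_holds = all(
    order_pair(a, b, m, n) == lcm(order_mod(a, m), order_mod(b, n))
    for a in range(m)
    for b in range(n)
)
print(formula_holds)
```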


Question 2. Suppose that we have two finite cyclic groups G = ⟨a⟩ and H = ⟨b⟩. When is 
G ⊕ H cyclic? 

Figure 3.7 Cayley graph for Z_2 ⊕ Z_2 ⊕ Z_2 ⊕ Z_2 
with the generating set 
{(1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)} 


The hypercube or tesseract = (Z_2)^4 


Answer. Recall Definition 1.5.3 of the greatest common divisor. Then the answer to question 
2 is that G ⊕ H is cyclic if and only if gcd(|H|, |G|) = 1. 


Proof. Let m = |G| and n = |H|. Suppose that gcd(m, n) = 1. Then (a, b) has order mn using 
the answer to question 1 and Exercise 3.1.7, which states that mn = gcd(m, n) lcm[m, n]. 
Since G ⊕ H has mn elements, this implies that G ⊕ H is cyclic. The converse is an exercise. ▲ 
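
Before attempting the converse, it can help to gather numerical evidence. The sketch below is our own illustration with hypothetical function names. It finds the largest element order in Z_m ⊕ Z_n using the answer to Question 1, and compares cyclicity with the condition gcd(m, n) = 1 for all m, n up to 12. This is evidence, not a proof.

```python
from math import gcd

def max_order(m, n):
    """Largest order of an element of Z_m ⊕ Z_n, via |(a, b)| = lcm[|a|, |b|]."""
    best = 0
    for a in range(m):
        for b in range(n):
            order_a = m // gcd(a, m)
            order_b = n // gcd(b, n)
            best = max(best, order_a * order_b // gcd(order_a, order_b))
    return best

def is_cyclic(m, n):
    """Z_m ⊕ Z_n is cyclic exactly when some element has order mn."""
    return max_order(m, n) == m * n

agrees_with_gcd = all(
    is_cyclic(m, n) == (gcd(m, n) == 1)
    for m in range(1, 13)
    for n in range(1, 13)
)
print(agrees_with_gcd, max_order(10, 5))
```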


Exercise 3.6.2 Prove the converse part of the previous result. 


Example 3. Suppose that m and n are positive integers with gcd(m, n) = 1. We have a 
group homomorphism f: Z → Z_m ⊕ Z_n, defined by f(x) = (x + mZ, x + nZ). Here the group 
operations are addition. It is easily seen that f(x + y) = f(x) + f(y). We leave this to you to 
check as an exercise. 

Note that, since gcd(m, n) = 1, ker f = {x | x ≡ 0 (mod m) and x ≡ 0 (mod n)} = mnZ. The first 
isomorphism theorem (Theorem 3.5.1) then says that Z/mnZ (or Z_mn) is isomorphic to the 
image of f which is a subgroup of Z_m ⊕ Z_n, but since both Z/mnZ and Z_m ⊕ Z_n have mn 
elements, it follows that the image of f must be all of Z_m ⊕ Z_n. Therefore we see that, as 
additive groups, Z_mn ≅ Z_m ⊕ Z_n if gcd(m, n) = 1. ▲ 
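
The counting argument in this example can be checked directly for small moduli. The following Python sketch is our own illustration; the function names and the sample moduli are assumptions made for the example. It tests whether x ↦ (x mod m, x mod n) is additive and hits every pair exactly once.

```python
def f(x, m, n):
    """The map x -> (x mod m, x mod n) from Z_mn to Z_m ⊕ Z_n."""
    return (x % m, x % n)

def add_pairs(p, q, m, n):
    """Componentwise addition in Z_m ⊕ Z_n."""
    return ((p[0] + q[0]) % m, (p[1] + q[1]) % n)

def is_isomorphism(m, n):
    """Check additivity and bijectivity of f on Z_mn by brute force."""
    N = m * n
    additive = all(
        f((x + y) % N, m, n) == add_pairs(f(x, m, n), f(y, m, n), m, n)
        for x in range(N)
        for y in range(N)
    )
    bijective = len({f(x, m, n) for x in range(N)}) == N
    return additive and bijective

print(is_isomorphism(4, 9))   # gcd(4, 9) = 1
print(is_isomorphism(4, 6))   # gcd(4, 6) = 2
```

With gcd(4, 6) = 2 the map is still additive but no longer 1-1, since 12 maps to (0, 0).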


Exercise 3.6.3 Assume gcd(m, n) = 1. Define f: Z → Z_m ⊕ Z_n by f(x) = (x + mZ, x + nZ) 
and show that f(x + y) = f(x) + f(y). Then show that, as additive groups, Z_m ⊕ Z_n and 
Z_mn are isomorphic - using the first isomorphism theorem from the preceding section. 


In Section 6.2 we will see that the last exercise is equally true when we consider Z_mn and 
Z_m ⊕ Z_n as rings (which means that the maps preserve the operation of multiplication as 
well as addition). This is called the Chinese remainder theorem. A version of it was found 
by the Chinese mathematician Sun Tsu in the first century AD. The main point of the theorem 
for the number theorist is the onto-ness of the map f which is proved very explicitly by 
constructing solutions to the simultaneous congruences 

$$\left.\begin{aligned} x &\equiv a \pmod{m} \\ x &\equiv b \pmod{n} \end{aligned}\right\} \quad \text{when } \gcd(m, n) = 1.$$


There is an old Chinese song which describes one construction of a simultaneous solution 
to three congruences. I explain it in my book [116, p. 14]. 
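
For two congruences, one standard construction of a simultaneous solution can be sketched in a few lines of Python. This is our own illustration and not necessarily the construction described in the song. The function name `crt_pair` is hypothetical, and the three-argument `pow` with exponent −1 (a modular inverse) assumes Python 3.8 or later.

```python
from math import gcd

def crt_pair(a, m, b, n):
    """Return x in [0, mn) with x ≡ a (mod m) and x ≡ b (mod n), assuming gcd(m, n) = 1."""
    if gcd(m, n) != 1:
        raise ValueError("moduli must be relatively prime")
    u = pow(m, -1, n)   # u*m ≡ 1 (mod n)
    v = pow(n, -1, m)   # v*n ≡ 1 (mod m)
    # a*n*v is ≡ a (mod m) and ≡ 0 (mod n); b*m*u is ≡ 0 (mod m) and ≡ b (mod n).
    return (a * n * v + b * m * u) % (m * n)

x = crt_pair(2, 3, 3, 5)
print(x, x % 3, x % 5)
```

For the moduli 3 and 5 with residues 2 and 3, the sketch returns 8, and indeed 8 ≡ 2 (mod 3) and 8 ≡ 3 (mod 5).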

The Chinese remainder theorem has numerous computer applications: for example, in 
writing programs to multiply very large integers. One might also find it astounding that it 
seems to say that a group like Z_mn, which we might think of as one-dimensional - at least 
when viewing its Cayley graph for the usual generating set - can be identified with a group 
Z_m ⊕ Z_n, with a two-dimensional Cayley graph - at least when gcd(m, n) = 1. Of course, 
groups are not vector spaces, so that dimension does not really mean anything. We define 
dimension of a vector space in Section 7.1. Moreover there are many ways to draw a given 
Cayley graph as well as many different Cayley graphs associated to a given group. 

Consider the Cayley graph X(Z_10 ⊕ Z_5, {(±1, 0), (0, ±1)}) in Figure 3.8. This can be 
viewed as a finite torus. The real torus (or doughnut) in Figure 3.9 is found by looking at 
the quotient of the plane R ⊕ R modulo Z ⊕ Z. Note that we cannot claim that Z_10 ⊕ Z_5 
is isomorphic to Z_50, since Z_10 ⊕ Z_5 has no element of order 50. 


Figure 3.8 A finite torus, which is the Cayley 
graph X(Z_10 ⊕ Z_5, {(±1, 0), (0, ±1)}) 


Figure 3.9 The continuous torus (obtained from 
the plane modulo its integer points; i.e., R ⊕ R 
modulo Z ⊕ Z) 


Exercise 3.6.4 Suppose that G and H are groups. Define a map T: G ⊕ H → G by T(x, y) = 
x, for all x ∈ G and all y ∈ H. The map is called the projection onto the first coordinate. 
Show that T is a group homomorphism. What is ker T? What does the first isomorphism 
theorem say? 


Exercise 3.6.5 Give four examples of pairwise non-isomorphic groups of order 8. Explain. 


Exercise 3.6.6 State whether each of the following statements is true or false and give a 
reason for your answer. In each of the last three statements the group operation on Z,, is 
addition mod n. 


(a) If G and H are groups, then G ⊕ H is isomorphic to H ⊕ G. 
(b) The order of (3,2) in Zi2 © Zy is 4. 
(c) Z_5 ⊕ Z_5 has exactly six subgroups of order 5. 


(d) The group of 3 × 3 matrices $\begin{pmatrix} 1 & a & b \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, with a, b ∈ Z_n, is isomorphic to Z_n ⊕ Z_n, for 
any n ≥ 2, where the operation is matrix multiplication. 


Exercise 3.6.7 Is the order of ab equal to the product of |a| and |b| for a, b in any finite 
group G? Explain. 


Exercise 3.6.8 Show that the group C of complex numbers under addition is isomorphic to 
the group R ⊕ R. 


There is one more group of order 8 (beyond those in Exercise 3.6.5). This group does 
not come out of products of groups or dihedral groups. It is the quaternion group. It is 
a subset of the quaternions invented in 1843 by William Rowan Hamilton to generalize 
the complex numbers C = R ⊕ iR to four dimensions. Well, he actually wanted something 
three-dimensional over R. But that proved to be impossible. He had to give up commutativity 
of multiplication also. He created the quaternions while walking with his wife along 
a canal in Dublin, Ireland. Then he carved the equations into a bridge - now a tourist 
destination, which I unfortunately missed when I was in Dublin. The space of quaternions is 


H = R ⊕ iR ⊕ jR ⊕ kR, where i^2 = j^2 = k^2 = ijk = −1.   (3.4) 


See Stewart [114] for more of the story of Hamilton and his quaternions. At the moment 
we are just interested in 8 of these quaternions in order to get the group considered in the 
following exercise. 
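
The relations in (3.4) determine how any two quaternions multiply. The Python sketch below is our own illustration: it stores a quaternion a + bi + cj + dk as a 4-tuple (a, b, c, d), a representation we chose for convenience, and multiplies with the usual Hamilton product. It then checks the defining relations and that the 8 elements ±1, ±i, ±j, ±k are closed under multiplication.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def neg(q):
    """Return -q."""
    return tuple(-x for x in q)

one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
minus_one = neg(one)
Q = [one, i, j, k, neg(one), neg(i), neg(j), neg(k)]

relations_hold = (
    qmul(i, i) == minus_one
    and qmul(j, j) == minus_one
    and qmul(k, k) == minus_one
    and qmul(qmul(i, j), k) == minus_one
)
closed = all(qmul(p, q) in Q for p in Q for q in Q)
noncommutative = qmul(i, j) != qmul(j, i)   # ij = k but ji = -k
print(relations_hold, closed, noncommutative)
```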


Exercise 3.6.9 Consider the quaternion group Q = {±1, ±i, ±j, ±k}, with i^2 = j^2 = k^2 = ijk = 
−1. Create a multiplication table for this group of order 8. Show that Q is not isomorphic 
to D_4. Check that the Cayley graph of Q with generating set S = {±i, ±j} can be drawn as 
in Figure 3.10. 


Exercise 3.6.10 Consider the quaternion group Q from the preceding exercise. Find three 
subgroups of Q of order 4 and state whether they are normal subgroups. 


Exercise 3.6.11 Is Z, ® Z, isomorphic to Z,,? Why? 

Figure 3.10 Cayley graph for the quaternion group with 
generating set {±i, ±j} 


The direct products considered above are the so-called external direct products. An 
internal direct product inside a group G will be a subgroup of G isomorphic to the external 
direct product of two other subgroups of G. Suppose H and K are subgroups of G. Then HK 
will be a subgroup of G as well as an internal direct product if H ∩ K = {e} and both H 
and K are normal subgroups of G. For then it can be shown that the map T: HK → H ⊕ K 
defined by T(hk) = (h, k) is well defined and a group isomorphism. 


Exercise 3.6.12 Prove that if H and K are normal subgroups of G such that H ∩ K = {e}, then 
the map T: HK → H ⊕ K defined by T(hk) = (h, k) is well defined and a group isomorphism. 


Exercise 3.6.13 Show that the groups G and H are both commutative if and only if G ⊕ H 
is commutative. 


Exercise 3.6.14 Consider the groups Z_60, Z_30 ⊕ Z_2. How many elements of orders 2, 3, 4, 5 
does each group have? 


Exercise 3.6.15 Suppose a and b are elements of the Abelian group G with finite orders |a| 
and |b| such that gcd(|a| ,|b|) = 1. 


(a) Show that |ab| = |a| |b|. 
(b) Is it true that (under the hypotheses of this problem) the subgroup H = ⟨a, b⟩ is 
isomorphic to ⟨a⟩ ⊕ ⟨b⟩? Why? 


Exercise 3.6.16 If g is an element of some group G with finite order |g| = r, show that 
|g^t| = r / gcd(r, t) = lcm(r, t) / t. 


Exercise 3.6.17 Suppose a and b are elements of the Abelian group G with finite orders |a| 
and |b|. Show that there is an element of G with order the least common multiple lcm[|a|, |b|]. 


Hint. Suppose we have the prime factorizations of the orders: |a| = ∏ p_i^{e_i} and |b| = ∏ p_i^{f_i}, 
where the p_i are pairwise distinct prime numbers. Define 

$$r_i = \begin{cases} e_i & \text{if } e_i < f_i, \\ 0 & \text{otherwise,} \end{cases} \qquad s_i = \begin{cases} f_i & \text{if } f_i \le e_i, \\ 0 & \text{otherwise.} \end{cases}$$

Then let r = ∏ p_i^{r_i} and s = ∏ p_i^{s_i}. It follows that a^r b^s is the element we are seeking - 
using the preceding two exercises. 


Exercise 3.6.18 Suppose that gcd(m, n) = 1. Define a map T: Z*_mn → Z*_m ⊕ Z*_n by 
T(x mod mn) = (x mod m, x mod n). Show that T is a well-defined group homomorphism 
between these multiplicative groups. 


Exercise 3.6.19 What happens to the preceding exercise if we do not assume that 
gcd(m, n) = 1? 


3.7 Group Actions 


We have already seen many examples of groups acting on sets. Here we want to prove 
something usually called Burnside’s lemma though it was first found by A.-L. Cauchy in 
1845 and then by F. G. Frobenius in 1887 and finally by W. Burnside in 1897. We will 
persist in naming the result after Burnside, since he did publicize it in his book on group 
theory. This lemma has applications in chemistry for counting certain kinds of chemical 
compounds. There are also applications to counting switching circuits, which are the sort 
of circuits that are the basis for computers. See Dornhoff and Hohn [25, Section 5.14]. The 
result leads immediately to work of Redfield in 1927 and Polya in 1937 - which is often 
called Polya enumeration theory. In Section 4.4 we consider an application to counting 
sudoku puzzles. 


Definition 3.7.1 A group G is said to act on a set X on the left if there 
is a function Θ: G × X → X, written Θ(σ, x) = σ · x = σx, for all σ ∈ G, x ∈ X, with the 
following two properties: 


1. (στ)x = σ(τx), for all σ, τ ∈ G and x ∈ X; 
2. if e is the identity of G, then ex = x, for all x ∈ X. 


The action given in the preceding definition is a left group action. One can similarly 
define a right group action of the group G on a set X written xσ or x^σ, for x ∈ X, σ ∈ G. In 
the case of a right action, the two defining properties become xe = x, for the identity e in G 
and (xσ)τ = x(στ), for all x ∈ X, σ, τ ∈ G. You can essentially switch a left action to a right 
action by replacing σ ∈ G with σ^{-1} in the formula for Θ(σ, x) in Definition 3.7.1. 


Example 1. The dihedral group D, acts on the regular n-gon. We examined the case n= 3 
in Figure 2.1. A 


Recall that we noted that there is often confusion about actions being either left or right 
actions. It is a right-left confusion that many people (the present author included) have. 
The important property to check is that of property 1 in Definition 3.7.1. We saw already 
with Example 1 in Section 2.1 - the dihedral group D3 - that confusion is quite natural. 
Recall the balls and buckets problem. Are you moving the balls or the buckets? This is really 
the question of right versus left actions. 

Example 2. The group G acts on itself by left multiplication (see Definition 2.3.2), $L_g x = gx$
for all $g, x \in G$. The group also acts on itself on the right via $R_g x = xg$. ▲

Example 3. The symmetric group $S_n$ acts on the set X = {1, 2, ..., n} via $\sigma \cdot x = \sigma(x)$, for all
$x \in X$. ▲


Exercise 3.7.1 Show that the group G acts on the set of functions $X = \{f: G \to \mathbb{C}\}$ by setting
$(L_g f)(x) = f(g^{-1}x)$, for all $x, g \in G$ and $f \in X$. This is the left action. There is a similar right action.
Define it.


Exercise 3.7.2 Define S(X) to be the set of 1-1, onto functions $\sigma: X \to X$. Show that S(X) is
a group under the multiplication given by composition of functions. Then show that if group
G acts on set X, via $T_\sigma(x) = \sigma \cdot x$, for $\sigma \in G$ and $x \in X$, we have a group homomorphism
$T: G \to S(X)$ given by $T(\sigma) = T_\sigma$, for all $\sigma \in G$. Note that |X| = n implies that we can identify
S(X) with $S_n$. This generalizes Cayley’s theorem.


Exercise 3.7.3 Check that the action of $S_n$ on {1, 2, ..., n} defined in Example 3 is indeed
a left action.


Example 4. The symmetric group $S_n$ acts on polynomials in n indeterminates $x_1, \ldots, x_n$,
with real coefficients. The action is given (as seen already in Definition 3.1.3) by

$$(\sigma P)(x_1, \ldots, x_n) = P(x_{\sigma(1)}, \ldots, x_{\sigma(n)}). \qquad (3.5)$$

For example, suppose that n = 4,

$P(x_1, x_2, x_3, x_4) = x_1 x_3 - x_2 x_4$,
$\sigma = (12)(34)$,
$\tau = (123)$.

Then
$\sigma P(x_1, x_2, x_3, x_4) = x_2 x_4 - x_1 x_3 = -P(x_1, x_2, x_3, x_4)$.

We find that
$\tau P(x_1, x_2, x_3, x_4) = x_2 x_1 - x_3 x_4$.
Now $\tau\sigma = (123)(12)(34) = (134)$. We compute $(\tau\sigma)P$ two ways. First
$(\tau\sigma)P(x_1, x_2, x_3, x_4) = x_3 x_4 - x_2 x_1$
and second
$\tau(\sigma P(x_1, x_2, x_3, x_4)) = \tau(x_2 x_4 - x_1 x_3) = x_3 x_4 - x_2 x_1$.

Mercifully they are indeed the same. Now prove that property 1 of a left action holds in
general for the action in (3.5), as an exercise. ▲
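
The computation in Example 4 can also be checked numerically. The following is only a sketch, under the assumption that a permutation of {1, ..., n} is stored as a Python dictionary and a polynomial as an ordinary function; the helper names `from_cycles`, `compose`, and `act` are made up for this illustration and are not notation from the text.

```python
import random

def from_cycles(n, *cycles):
    """Build a permutation of {1,...,n} (dict i -> image of i) from disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for k, i in enumerate(cyc):
            perm[i] = cyc[(k + 1) % len(cyc)]
    return perm

def compose(tau, sigma):
    """Return the product tau*sigma, meaning: apply sigma first, then tau."""
    return {i: tau[sigma[i]] for i in sigma}

def act(sigma, P):
    """Action (3.5): (sigma P)(x_1,...,x_n) = P(x_sigma(1),...,x_sigma(n))."""
    n = len(sigma)
    return lambda *x: P(*(x[sigma[i] - 1] for i in range(1, n + 1)))

# The polynomial and permutations of Example 4.
P = lambda x1, x2, x3, x4: x1 * x3 - x2 * x4
sigma = from_cycles(4, (1, 2), (3, 4))
tau = from_cycles(4, (1, 2, 3))

# Evaluate both sides of (tau sigma)P = tau(sigma P) at a random integer point.
x = [random.randint(-50, 50) for _ in range(4)]
lhs = act(compose(tau, sigma), P)(*x)
rhs = act(tau, act(sigma, P))(*x)
print(lhs == rhs)
```

Agreement at random points is of course evidence rather than a proof; the exercise still asks for the general argument.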


Exercise 3.7.4 Show that the action in equation (3.5) is indeed a left action. 

Example 5. Recall that the general linear group appeared in Exercise 3.5.5 as well as Example 3
of Section 2.3 (for 2 × 2 matrices with entries in the integers mod 2). The general linear
group $GL(n, \mathbb{R})$ of non-singular real n × n matrices M acts on column vectors $x \in \mathbb{R}^n$ via
$M \cdot x = Mx$. Here Mx means matrix multiply the n × n matrix M times the n × 1 matrix x. ▲

Example 6. The symmetric group $S_n$ acts on vectors $(v_1, \ldots, v_n) \in \mathbb{R}^n$ by

$$\sigma \cdot (v_1, \ldots, v_n) = (v_{\sigma^{-1}(1)}, \ldots, v_{\sigma^{-1}(n)}). \qquad (3.6)$$
▲


Exercise 3.7.5 Show that the action of $S_n$ on $\mathbb{R}^n$ given in equation (3.6) is indeed a left
group action.


Hint. One way to do this is to note that we can identify $\mathbb{R}^n$ with functions $f: \{1, \ldots, n\} \to \mathbb{R}$
by identifying the function f with the vector $(f(1), \ldots, f(n)) \in \mathbb{R}^n$. Then use the method of
Exercise 3.7.1.


We have already seen a special case of the following definition in the proof of 
Proposition 3.1.1. 


Definition 3.7.2 Assume that group G acts on set X. The orbit of $x \in X$ is
$\mathrm{Orb}(x) = \{\sigma x \mid \sigma \in G\}$.


Lemma 3.7.1 Assume that group G acts on set X. We have an equivalence relation on the
set X given by x ~ y iff $y \in \mathrm{Orb}(x)$. Of course, the equivalence classes are the orbits.


Exercise 3.7.6 Prove the preceding lemma.


Exercise 3.7.7 Given a group G, we get a group action of $g \in G$ on G itself via conjugation:
$T_g(x) = gxg^{-1}$, for any $x \in G$. Show that this does indeed define a group action of G on G.
What are the orbits of this group action?


The following definition is needed for our discussion of Burnside’s lemma. 


Definition 3.7.3 Assume that group G acts on set X. The stabilizer of $x \in X$ is
$\mathrm{Stab}(x) = \{\sigma \in G \mid \sigma x = x\}$.


The fixed points of $\sigma \in G$ are
$\mathrm{Fix}(\sigma) = \{x \in X \mid \sigma x = x\}$.


Exercise 3.7.8 Show that Stab(x) is a subgroup of G.


Exercise 3.7.9 Consider Example 4 in this section - the symmetric group $S_4$ acting on
polynomials in four indeterminates, with real coefficients. Find the orbit of the polynomial
$P(x_1, x_2, x_3, x_4) = x_1 x_3 - x_2 x_4$ under $S_4$. What is the order of Stab(P) in $S_4$?

Polynomials in n indeterminates over any field F that are fixed by every element of $S_n$
are called symmetric polynomials. Favorites are the elementary symmetric polynomials:

$$s_1 = \sum_{i=1}^{n} x_i, \quad s_2 = \sum_{1 \le i < j \le n} x_i x_j, \quad s_3 = \sum_{1 \le i < j < k \le n} x_i x_j x_k, \quad \ldots, \quad s_n = x_1 x_2 \cdots x_n. \qquad (3.7)$$

The fundamental theorem on symmetric polynomials says that any symmetric polynomial
is a polynomial in the elementary symmetric polynomials. See Dummit and Foote [28] or
Herstein [42] for more information.
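
To get a feel for the elementary symmetric polynomials, here is a small sketch that evaluates them at numbers and checks two facts: the values do not change when the inputs are permuted, and, up to sign, they are the coefficients of the monic polynomial with those numbers as roots. The function names `elementary_symmetric` and `monic_from_roots` are invented for this illustration, and the code assumes Python 3.8 or later (for `math.prod`).

```python
from itertools import combinations, permutations
from math import prod

def elementary_symmetric(xs, k):
    """Value of s_k at xs: the sum of all products of k entries with distinct indices."""
    return sum(prod(c) for c in combinations(xs, k))

def monic_from_roots(xs):
    """Coefficients, highest power first, of the product of (y - r) over r in xs."""
    coeffs = [1]
    for r in xs:
        # Multiply the current polynomial by (y - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

xs = [2, 3, 5, 7]
values = [elementary_symmetric(xs, k) for k in range(1, 5)]
print(values)                     # s_1, s_2, s_3, s_4 at (2, 3, 5, 7)

# Permuting the inputs leaves every s_k unchanged.
invariant = all(
    [elementary_symmetric(p, k) for k in range(1, 5)] == values
    for p in permutations(xs)
)

# Signed values of s_0 = 1, s_1, ..., s_n match the polynomial coefficients.
signed = [(-1) ** k * elementary_symmetric(xs, k) for k in range(5)]
print(invariant, signed == monic_from_roots(xs))
```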


Proposition 3.7.1 (Orbit/Stabilizer Theorem). Assume that the finite group G acts on the
finite set X. Then, for any $x \in X$, $|\mathrm{Orb}(x)| = |G| / |\mathrm{Stab}(x)|$.


Proof. We can define a 1-1, onto function $F: G/\mathrm{Stab}(x) \to \mathrm{Orb}(x)$ via $F(\sigma\,\mathrm{Stab}(x)) = \sigma x$.
We leave it as an exercise to prove this function is well defined, 1-1, and onto. ■
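
The orbit/stabilizer count is easy to test by brute force on a small case. The sketch below is an illustration chosen here, not an example from the text: it lets $S_4$ (stored as tuples, acting on {0, 1, 2, 3}) act on the six 2-element subsets and compares |Orb(x)| · |Stab(x)| with |G|.

```python
from itertools import combinations, permutations

n = 4
# S_4 as tuples: sigma[i] is the image of i, for i in {0, 1, 2, 3}.
G = list(permutations(range(n)))
# The set X being acted on: all 2-element subsets of {0, 1, 2, 3}.
X = [frozenset(c) for c in combinations(range(n), 2)]

def act(sigma, subset):
    """Left action on subsets: apply sigma to every element of the subset."""
    return frozenset(sigma[i] for i in subset)

def orbit(x):
    return {act(sigma, x) for sigma in G}

def stabilizer(x):
    return [sigma for sigma in G if act(sigma, x) == x]

x = frozenset({0, 1})
print(len(orbit(x)), len(stabilizer(x)), len(G))
```

Here the orbit of {0, 1} is all of X, so the product of the first two printed numbers should equal the third.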


Exercise 3.7.10 Prove that the function F defined in the proof of Proposition 3.7.1 is well 
defined, 1-1, onto. 


Example. If H is a subgroup of the group G, then H acts on G by $h \cdot g = gh^{-1}$, for $h \in H$
and $g \in G$. The orbits are the left cosets gH. The orbit/stabilizer theorem states that |G/H| =
|G| / |H|. We noticed this earlier in part (3) of Proposition 3.3.1. ▲


Exercise 3.7.11 Find the order of the group of motions of a tetrahedron by computing
|Orb(f)| and |Stab(f)| for any face f.


The following lemma is usually given the name Burnside’s lemma - but it does not appear 
to be due to Burnside, as we noted earlier in this section. This lemma says that the number 
of orbits of a finite group G acting on the finite set X is the average size of the sets $\mathrm{Fix}(\sigma)$,
when averaged over $\sigma \in G$.


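Lemma 3.7.2 (Burnside’s Lemma). Assume that the finite group G acts on the finite set X.
Then the number of orbits of the group action is

$$\#\text{orbits} = \frac{1}{|G|} \sum_{\sigma \in G} |\mathrm{Fix}(\sigma)|.$$


Proof. Each orbit contains |Orb(x)| elements, each contributing 1/|Orb(x)| to the sum below, so each orbit contributes exactly 1. By Proposition 3.7.1

$$\#\text{orbits} = \sum_{x \in X} \frac{1}{|\mathrm{Orb}(x)|} = \sum_{x \in X} \frac{|\mathrm{Stab}(x)|}{|G|}.$$

It follows that

$$\#\text{orbits} = \frac{1}{|G|} \sum_{x \in X} \sum_{\substack{\sigma \in G \\ \sigma x = x}} 1.$$

Now interchange the two sums to get

$$\#\text{orbits} = \frac{1}{|G|} \sum_{\sigma \in G} \sum_{\substack{x \in X \\ \sigma x = x}} 1 = \frac{1}{|G|} \sum_{\sigma \in G} |\mathrm{Fix}(\sigma)|.$$
■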

Example. The group G acts on G by conjugation: $C_g(x) = gxg^{-1}$, for $x, g \in G$. We already
studied conjugation in Section 3.2. The orbit of $x \in G$ is a conjugacy class $\{x\} = \{gxg^{-1} \mid g \in G\}$
as in Exercise 3.7.7. Then the stabilizer of $x \in G$ is $\mathrm{Stab}(x) = C_G(x) = \{g \in G \mid xg = gx\}$ =
the centralizer of x in G. The center of G is $Z(G) = \{g \in G \mid gx = xg, \ \forall x \in G\}$. Suppose that
$x_1, \ldots, x_r \notin Z(G)$ represent all the distinct conjugacy classes in G except those coming from
the center. One has the class equation

$$|G| = |Z(G)| + \sum_{i=1}^{r} |\{x_i\}| = |Z(G)| + \sum_{i=1}^{r} \frac{|G|}{|C_G(x_i)|}. \qquad (3.8)$$

This result is pretty obvious, since G is a disjoint union of its conjugacy classes as conjugacy
is an equivalence relation. Thus the number of elements of G is the sum of the orders of
the conjugacy classes. The conjugacy classes of elements of the center of G have just one
element. The order of a conjugacy class is found using Proposition 3.7.1. ▲
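
As a sanity check on the class equation, one can let a computer list the conjugacy classes of a small group. The sketch below uses $S_4$, a choice made here purely for illustration, with permutations stored as tuples; `mul`, `inv`, `conj_class`, and `centralizer` are hypothetical helper names.

```python
from itertools import permutations

n = 4
G = list(permutations(range(n)))   # S_4; s[i] is the image of i

def mul(s, t):
    """Product s*t as functions: apply t first, then s."""
    return tuple(s[t[i]] for i in range(n))

def inv(s):
    """Inverse permutation."""
    r = [0] * n
    for i, si in enumerate(s):
        r[si] = i
    return tuple(r)

def conj_class(x):
    """Orbit of x under conjugation by all of G."""
    return frozenset(mul(mul(g, x), inv(g)) for g in G)

def centralizer(x):
    """Stabilizer of x under conjugation."""
    return [g for g in G if mul(g, x) == mul(x, g)]

center = [g for g in G if all(mul(g, x) == mul(x, g) for x in G)]
classes = {conj_class(x) for x in G}
noncentral = [c for c in classes if len(c) > 1]

# Right-hand side of the class equation: |Z(G)| plus the non-central class sizes.
total = len(center) + sum(len(c) for c in noncentral)
print(sorted(len(c) for c in classes), total == len(G))
```

Each class size can also be compared with |G| divided by the order of the centralizer of a representative, as Proposition 3.7.1 predicts.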


We can use equation (3.8) to prove the following theorem of Cauchy. 


Theorem 3.7.1 (Cauchy). If a prime p divides the order |G| of a finite group G, then G
has an element of order p.


Proof. There are two cases (and each case has two subcases). Both cases use induction on
|G|. The theorem is certainly true if G has order 1 or 2 - or even any prime.

Case 1. The group G is Abelian. If G contains no proper subgroup other than {e} then G is cyclic and has
an element of order p by Theorem 2.5.1. Otherwise G has a proper subgroup $H \ne \{e\}$. If p divides
|H|, we are done by the induction hypothesis. Otherwise, since p divides |G| = |G/H| |H|, it
follows that p must divide |G/H|. Thus G/H must contain an element of order
p by the induction hypothesis. But this implies G must also contain an element of order p.
For a proof, see the exercise below.

Case 2. The group G is not Abelian. Recall that $C_G(g) = \{x \in G \mid xg = gx\}$. Suppose that p
does not divide $|G/C_G(x)|$ for some $x \in G$, $x \notin Z(G)$. Proposition 3.3.1 says $|G| = |G/C_G(x)| \cdot |C_G(x)|$.
Thus p must divide $|C_G(x)|$, since p divides |G|. Then by induction, since $|C_G(x)| < |G|$
(Why?), we know that $C_G(x)$ has an element of order p and thus so does G.

Our last subcase is that p divides $|G/C_G(x)| = |G|/|C_G(x)|$ for all $x \in G$, $x \notin Z(G)$. Then by
the class equation (3.8), p divides every term in the sum as well as the left-hand side, which
implies that p divides |Z(G)|. Then by induction because G is not Abelian and |Z(G)| < |G|,
we know that there must be an element of order p in the center of G. ■


Exercise 3.7.12 


(a) Prove the last statement in Case 1 of the proof of Theorem 3.7.1. 
(b) Answer the “Why?” in Case 2 of the proof of Theorem 3.7.1. 


Hint for (a). gH has order p $\implies$ $g^p = h \in H$ and $g^k \notin H$ if $1 \le k < p$. If |H| = n, we know p
does not divide n. What is the order of $g^n$?

The preceding theorem is a special case of a theorem proved in 1872 by Ludwig
Sylow, a high school teacher in Norway. To explain this theorem, some definitions are
needed. For a prime p, a p-group means the group has order a power of p. One defines
for a prime p, a Sylow p-subgroup of a finite group G to be a maximal p-subgroup of G.
Sylow’s first theorem says that, for any prime p, if for a finite group G, we write $|G| = p^e n$,
with $e \ge 1$, gcd(p, n) = 1, then there is a subgroup $H_i$ of G with $|H_i| = p^i$, for $1 \le i \le e$. In
particular, there is a Sylow p-subgroup of G of order $p^e$. Sylow’s second theorem says if H
is a subgroup of the finite group G and $|H| = p^i$, for some prime p and some power i, then
H is contained in some Sylow p-subgroup of G. Sylow’s third theorem says if $|G| = p^e n$,
with $e \ge 1$, gcd(p, n) = 1, then all Sylow p-subgroups are conjugate and the number $N_p$ of
Sylow p-subgroups is a divisor of |G| and $N_p \equiv 1 \pmod{p}$, for fixed p. We will not prove
the Sylow theorems. You can find proofs in most algebra books: for example, Gallian [33]
or Dummit and Foote [28].

Define the normalizer $N_G(H)$ of a subgroup H in a group G to be

$$N_G(H) = \{g \in G \mid gHg^{-1} \subset H\}.$$

The normalizer of a subgroup H is a subgroup of G. Then one can also show that if P
denotes a p-Sylow subgroup of the finite group G, it follows that, if $N_p$ is defined as in the
preceding paragraph, $N_p = |G/N_G(P)|$.


Exercise 3.7.13 Prove the last statements, assuming the Sylow theorems.


Hint. The conjugation mapping $c_g(x) = gxg^{-1}$, for $g \in G$, takes subgroups to conjugate
subgroups. Then the stabilizer of subgroup H is $N_G(H)$. The result follows from the
orbit/stabilizer theorem.


Examples: Applications of Sylow Theorems 


1. The group $S_3 \cong D_3 = \{I, R, R^2, F, FR, FR^2\}$ has three Sylow 2-subgroups $\langle F \rangle$, $\langle FR \rangle$, and
$\langle FR^2 \rangle$. This group has one Sylow 3-subgroup $\langle R \rangle$. This checks with Sylow’s third theorem,
since $|S_3| = 2 \cdot 3$ and $N_2 \equiv 1$ (mod 2), together with the fact that $N_2$ must divide 6, allows only $N_2 = 1$ or 3; here $N_2 = 3$.
Similarly $N_3 = 1$ checks with $N_3 \equiv 1$ (mod 3).

2. $A_4$ has an odd number of subgroups of order 4. This follows from $N_2 \equiv 1$ (mod 2).

3. There is only one group of order 15 and it is cyclic. Sylow says that both $N_3$ and $N_5$ must
be 1. That means a group G of order 15 must have one normal subgroup A of order 3
and another normal subgroup B of order 5. These subgroups have to be cyclic. Moreover
$A \cap B = \{e\}$. Then recall Exercise 3.6.12.

4. Suppose G is a group of order 10. Then Sylow’s third theorem says it must have $N_2 = 1 + 2q$
Sylow 2-subgroups and $N_5 = 1 + 5r$ Sylow 5-subgroups, where $N_2$ and $N_5$ divide
10. This means that $N_2$ is either 1 or 5 while $N_5$ must be 1. This will help us to determine
the groups of order 10 in Section 4.5. ▲
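
The arithmetic in these examples is just a search over divisors, which can be sketched in a few lines. The function name `sylow_candidates` is invented here; it only lists the values of $N_p$ that the two numerical conditions quoted above (being a divisor of |G| and being congruent to 1 mod p) permit, and says nothing about which of them actually occur for a particular group.

```python
def sylow_candidates(order, p):
    """Divisors d of `order` with d ≡ 1 (mod p): the values of N_p not ruled out
    by the divisibility and congruence conditions of Sylow's third theorem."""
    if order % p != 0:
        raise ValueError("p must divide the group order")
    return [d for d in range(1, order + 1) if order % d == 0 and d % p == 1]

for order, p in [(6, 2), (6, 3), (15, 3), (15, 5), (10, 2), (10, 5)]:
    print(order, p, sylow_candidates(order, p))
```

For order 15 both lists collapse to a single value, which is the starting point of the argument in item 3.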


Exercise 3.7.14 Finish the discussion of groups of order 15. 


One can use Burnside’s lemma to count other sorts of things than possible groups of 
certain orders. 


Example. Consider the number of necklaces that one can make using beads with two colors
located at the vertices of a hexagon. This is equivalent to counting the chemical compounds
that can be obtained by attaching H or CH$_3$ radicals to each carbon atom in a benzene ring
(see Figure 0.1 in the preface). You might think at first that there are $2^6$ ways to do this
but these do not all give different compounds since rotation and flip do not change the
chemistry or the necklace. Thus we need to consider the group $D_6$ acting on the necklaces.
Distinct necklaces are distinct orbits (i.e., distinct equivalence classes). We need to make a
table. To compute the numbers in the last column of the table, see equation (3.10) below. ▲


Group element $\sigma$ | Number of such elements | $|\mathrm{Fix}(\sigma)|$
I | 1 | $2^6$
F, flip across axis between opposite vertices | 3 | $2^4$
F, flip across axis between opposite sides | 3 | $2^3$
$R, R^5$, where R = rotate by $2\pi/6 = \pi/3$ | 2 | $2$
$R^2, R^4$ | 2 | $2^2$
$R^3$ | 1 | $2^3$


From this, we find that the number of orbits is the sum over the rows of the table of
(number of such elements) × $|\mathrm{Fix}(\sigma)|$, divided by the order of the group, that is,
$(2^6 + 3 \cdot 2^4 + 3 \cdot 2^3 + 2 \cdot 2 + 2 \cdot 2^2 + 2^3)/12 = 156/12 = 13$. The 13 necklaces are shown
in Figure 3.11.
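
The necklace count is small enough to confirm by brute force. The following sketch assumes the six vertices are labeled 0 to 5 and writes the twelve elements of the dihedral group as explicit permutations; it counts the orbits twice, once by averaging fixed points as in Burnside’s lemma and once by collecting orbits directly. The function names are made up for this illustration.

```python
from itertools import product

n = 6  # vertices of the hexagon, labeled 0..5

def rotation(k):
    """Rotation sending vertex i to i + k (mod 6)."""
    return tuple((i + k) % n for i in range(n))

def reflection(k):
    """Flip sending vertex i to k - i (mod 6)."""
    return tuple((k - i) % n for i in range(n))

D6 = [rotation(k) for k in range(n)] + [reflection(k) for k in range(n)]

# A coloring assigns color 0 or 1 to each vertex.
colorings = list(product((0, 1), repeat=n))

def act(g, f):
    """Move the bead at vertex i to vertex g[i]."""
    out = [None] * n
    for i in range(n):
        out[g[i]] = f[i]
    return tuple(out)

# Burnside: average the number of colorings fixed by each group element.
fix_counts = [sum(1 for f in colorings if act(g, f) == f) for g in D6]
burnside = sum(fix_counts) // len(D6)

# Direct count: sweep through the colorings, marking whole orbits as seen.
seen, orbits = set(), 0
for f in colorings:
    if f not in seen:
        orbits += 1
        seen.update(act(g, f) for g in D6)

print(burnside, orbits)  # both are 13
```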

Let us say a bit more about these coloring problems. Suppose that G is a finite group
permuting elements of a finite set X. And suppose C is a finite set of colors. Colorings of X
are elements of the set $C^X = \{f: X \to C\}$. There are $|C|^{|X|}$ such colorings. An element $\sigma \in G$
acts on $f \in C^X$ via $(\sigma f)(x) = f(\sigma^{-1}x)$, for $x \in X$. Then the Burnside lemma says

$$\#(G\text{-inequivalent colorings}) = \frac{1}{|G|} \sum_{\sigma \in G} |\mathrm{Fix}(\sigma)| = \frac{1}{|G|} \sum_{\sigma \in G} |C|^{\ell(\sigma)}, \qquad (3.9)$$

where $\ell(\sigma)$ is the number of cycles in the disjoint cycle decomposition of $\sigma$ as a permutation
of elements of X. Here it is assumed that we include all 1-cycles as well so that, for example,
we write the identity in $S_n$ as $I = (1)(2) \cdots (n)$. Thus $\ell(I) = n$. To prove equation (3.9), note
that if $f \in C^X$ is fixed by $\sigma \in G$, then $f \circ \sigma^i = f$, for all powers i. If $u, v \in X$ are in the same
cycle in $\sigma$, then $v = \sigma^i(u)$ for some i. Then f(u) = f(v). So f must be constant on a cycle of
$\sigma$. Conversely, if f is constant on each cycle of $\sigma$, then $f \in \mathrm{Fix}(\sigma)$. Why? Moreover, it follows
that

$$|\mathrm{Fix}(\sigma)| = |C|^{\ell(\sigma)}. \qquad (3.10)$$
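
Formulas (3.9) and (3.10) turn the counting problem into a matter of counting cycles, which avoids listing colorings at all. The sketch below is one possible way to code this, with permutations of {0, ..., n-1} stored as tuples; `cycle_count` and `count_colorings` are names chosen here, and the hexagon group is rebuilt so that the block stands alone.

```python
def cycle_count(g):
    """Number of cycles of the permutation g of {0,...,n-1}, 1-cycles included."""
    seen, count = set(), 0
    for start in range(len(g)):
        if start not in seen:
            count += 1
            j = start
            while j not in seen:   # walk around the cycle containing `start`
                seen.add(j)
                j = g[j]
    return count

def count_colorings(group, num_colors):
    """Right-hand side of (3.9): average of num_colors ** (number of cycles)."""
    total = sum(num_colors ** cycle_count(g) for g in group)
    return total // len(group)

# The dihedral group of the hexagon, vertices labeled 0..5.
n = 6
D6 = [tuple((i + k) % n for i in range(n)) for k in range(n)] + \
     [tuple((k - i) % n for i in range(n)) for k in range(n)]

print(count_colorings(D6, 2))   # 13, matching the necklace example
print(count_colorings(D6, 3))   # 92 necklaces with three colors
```

The design choice here is to pass the group as an explicit list of permutations, so the same function can be reused for other small symmetry groups.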


Exercise 3.7.15 Give an answer to the “Why?” in the previous paragraph. Then explain 
why equation (3.10) follows from this. 


Exercise 3.7.16 In how many ways can the faces of a tetrahedron be colored with three
different colors?


Much more can be said about the use of group actions in counting colorings - a subject
which is referred to as Pólya enumeration theory. See Dornhoff and Hohn [25] for
more information on this theory and its extensions. This text also gives applications to
the enumeration of logic circuits and switching functions (now called Boolean functions)
$f: \{0, 1\}^n \to \{0, 1\}$. One wants to know, for example, how many essentially different devices
are needed to implement these switching functions for n = 4. The answer is that 222 devices
suffice. See Dornhoff and Hohn [25, p. 263] for more information on this subject. The logic
circuits and Boolean or switching functions are idealizations of the circuits in the innards
of our computers. Once these circuits were combinations of transistors, resistors, etc. connected
by wires. Now one chip may contain millions - or billions - of these elements. From
the 1970s to 2016, the number of transistors on an integrated circuit chip microprocessor
CPU went from thousands to billions. See Wikipedia on transistor count or the IBM webpage.
This frightening explosion of complexity requires algebra to deal with it. See also my
book [116]. We will say a bit more about Boolean functions in Section 8.2.


Figure 3.11 The 13 necklaces with six beads of two colors


Exercise 3.7.17 Use mathematical induction to show that the coefficients of a monic polynomial
of degree n are given by elementary symmetric polynomials in the roots defined by
equation (3.7):

$$f(y) = (y - x_1)(y - x_2) \cdots (y - x_n) = y^n - s_1 y^{n-1} + s_2 y^{n-2} - s_3 y^{n-3} + \cdots + (-1)^n s_n.$$


Exercise 3.7.18 Find the number of tiny bracelets of four beads that can be made with two 
colors of beads. 


Exercise 3.7.19 In how many ways can we paint a square floor made up of nine square tiles 
using purple and orange paint? 


Exercise 3.7.20 In how many ways can you color a cube’s faces with four colors? 


Exercise 3.7.21 Consider the group of motions of the regular octahedron (a solid with eight 
faces consisting of equilateral triangles). What is the order of this group? 


Exercise 3.7.22 What is the order of the group of motions of the regular dodecahedron (a
solid with 12 faces consisting of regular pentagons)? See Figure 3.12 for the dodecahedron
graph. In this figure, one face is stretched out to contain the rest.


Figure 3.12 The dodecahedron graph drawn by 
Mathematica 


Besides the regular Platonic solids of Figure 2.16, there are the 13 Archimedean solids
whose faces are regular polygons of more than one kind and whose vertices all have the same number
of edges emanating from them. One example is shown in Figure 3.13.


Exercise 3.7.23 Consider the group G of rotations of the Archimedean solid known as the 
cuboctahedron shown in Figure 3.13. This solid is made up of eight identical regular trian- 
gles and six identical squares with each vertex having four edges coming out of it. What is 
the order of the group G? 


Exercise 3.7.24 Find all the conjugacy classes in the dihedral group $D_3$ and then do the
same for $D_4$. Check the class equation (3.8) in each case.


Exercise 3.7.25 Show that if G is a group with center Z = Z(G) then Z is a normal subgroup
of G.


Figure 3.13 The cuboctahedron drawn
by Mathematica


Exercise 3.7.26 Show that if G is a group with center Z= Z(G) such that G/Z is cyclic, 
then G is Abelian. 


Applications and More Examples 
of Groups 


4.1 Public-Key Cryptography 


One important application of group theory is to be found in an algorithm used to encrypt 
messages which is due to Rivest, Shamir, and Adleman in 1978. This method is called RSA 
cryptography. Perhaps a few definitions are in order. Cryptography is a part of the field of 
cryptology. Cryptology refers to the design of systems for encoding information that needs 
to be kept secret (cryptography) as well as the discovery of mechanisms for breaking into 
such systems (cryptanalysis). In this section we will give only a brief introduction to RSA 
cryptography. More information can be found in various books such as those of Kenneth 
H. Rosen [91], Neal Koblitz [55], or Ramanujachary Kumanduri and Cristina Romero [62]. 
See also the article of N. Koblitz [57] as well as that of N. Koblitz and A. Menezes [58]. Or see
Wikipedia as well as the Mathematica website for RSA encryption and decryption. 

Quite often in modern times one wants to send a message (usually on the web) which
can only be understood by the recipient. Public-key cryptography allows this to be done
fairly easily and fairly securely (we hope). Think of your message as a number m mod pq,
where p and q are very large distinct primes that do not divide m. The encryption of the
message m is just $m^t$ (mod pq) for some power t. To decrypt, one needs a power s so that
$m^{ts} \equiv m$ (mod pq). From what we know about the group $\mathbb{Z}_{pq}^*$, we know that we need to
solve $ts \equiv 1 \pmod{\phi(pq)}$, where $\phi(pq) = |\mathbb{Z}_{pq}^*|$. Here $\phi$ is Euler’s phi-function considered
in Section 2.3. To find s, it therefore seems that you need to know $\phi(pq) = (p - 1)(q - 1)$.
See Exercises 2.3.10 and 3.6.18.

What happens is that anyone who wants to receive a secret message chooses p, q, t and
publishes t and pq. The public key is (t, pq) and the secret key needed to decrypt the message
is s. Anyone who wants to send a secret message m will compute $m^t$ (mod pq) and send
this number. Why is it that a third party will be unable to compute m from this public
knowledge? It is possible to test a large number to see that it is prime with a relatively
small amount of computer time. The tests are probabilistic and thus there is a tiny chance
of failure. See Kumanduri and Romero [62, Chapter 6] or Rosen [91]. The computational
fact that enables RSA cryptography is that (at the moment) if p and q are huge primes, no
one knows how to factor pq in a reasonable amount of computer time and then no one can
compute $\phi(pq) = (p - 1)(q - 1)$. The security of the message thus depends on the difference
in speed of primality testing versus factoring. If this situation should ever change, the RSA
codes would not be secure.

The primes p and q must be VERY large for the RSA code to be secure. The RSA website
(www.rsa.com) has a discussion. In 2011 the suggested size of the modulus n = pq was 1024
bits (meaning 0s and 1s needed to express the number pq base 2) and thus since $2^{1025} - 1 \approx
3.5954 \times 10^{308}$ the number of decimal digits was something like 308. You would not like
to see such a large number written down here. The RSA website said such a key was OK
for corporate uses but not for extremely valuable keys. More recently one finds that the
National Security Agency (NSA) is worried that someday soon quantum computers will
exist and much larger primes would be necessary. See the Wikipedia article on key size.

It is somewhat frightening to one born in 1942 that a googol is only $10^{100}$. I once thought
that a 9-digit social security number was too large to factor easily. Thus, in the 1970s 
I was impressed when someone could factor their social security number on their HP-65 
programmable calculator. 

Other groups than $\mathbb{Z}_{pq}^*$ can be used for cryptography. In particular, one can use the group
of points on an elliptic curve. We will say more about these things in Section 8.5. 

Here we give a toy example of RSA cryptography with tiny primes and the encoding of 
letters modelled on examples in Rosen’s book [91]. First we make a table of letters from 
A to Z and the corresponding pairs of numbers 00 to 25. 


A  | B  | C  | D  | E  | F  | G  | H  | I  | J  | K  | L  | M
00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12

N  | O  | P  | Q  | R  | S  | T  | U  | V  | W  | X  | Y  | Z
13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25


We assume the public key is (n = pq, t) = (37 · 59, 23). We need gcd($\phi$(37 · 59), 23) = 1
and that is indeed the case - since $\phi$(37 · 59) = 36 · 58 and gcd(36 · 58, 23) = 1. Note that
37 · 59 = 2183.

Before encrypting anything, we need to address the problem of modular exponentiation.
We will need to find $1304^{23}$ mod 2183, for example. To do this, you might
want to compute the integer $1304^{23}$. But that would be a really huge number since
$\log_{10} 1304^{23} \approx 71.65$. In the old days (when I was young) this number would crash your calculator.
Despite this humongous size, Mathematica says it can handle a number this large. It
says Mod[1304^23, 2183] = 404. On February 24, 2011, I saw that on a PC running Linux,
Mathematica said it could deal with a number whose log base 10 is over 300 000 000.

Mathematica also has something called PowerMod. The command PowerMod 
[1304,23,2183] yields the same answer 404 and it should work better than just Mod 
since it is clever enough to compute inverses and square roots when they exist. 

The number theory algorithm you would have to use if you were found on a desert island 
goes as follows. First you would write the exponent 23 in base 2 as 23=1+2+4+ 16. 
Then you repeatedly square and reduce mod 2183. This gives the following computation. 
Since we keep reducing mod 2183 the numbers never have more than four digits. 


i  | $1304^i$ (mod 2183)
1  | 1304
2  | $1304^2 \equiv 2042$
4  | $2042^2 \equiv 234$
8  | $234^2 \equiv 181$
16 | $181^2 \equiv 16$


Then we obtain $1304^{23} = 1304^{1+2+4+16} = 1304 \cdot 1304^2 \cdot 1304^4 \cdot 1304^{16} \equiv 1304 \cdot 2042 \cdot 234 \cdot
16 \equiv 404$ (mod 2183). Mercifully we do not really need to do this if Mathematica (or some
similar software) is available. More discussion of this algorithm can be found in Rosen [91].
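
For readers without Mathematica, the desert-island algorithm is short enough to write out. The sketch below is one standard way to code repeated squaring in Python; the function name `power_mod` is chosen here to echo Mathematica's PowerMod, and Python's built-in three-argument `pow` is used only as an independent check.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring.

    Assumes exponent >= 0 and modulus >= 1. The binary digits of the exponent
    are read from the lowest to the highest.
    """
    result = 1 % modulus
    square = base % modulus            # holds base^(2^k) mod modulus
    while exponent > 0:
        if exponent & 1:               # this binary digit of the exponent is 1
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result

# The successive squares 1304^(2^k) mod 2183 for k = 0, 1, 2, 3, 4.
squares = [power_mod(1304, 2 ** k, 2183) for k in range(5)]
print(squares)

print(power_mod(1304, 23, 2183))       # 404
print(power_mod(1304, 23, 2183) == pow(1304, 23, 2183))
```

Because every intermediate product is reduced mod 2183, no number in the computation exceeds the square of the modulus.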

Exercise 4.1.1 Use the Mathematica command Mod[a,n] and repeated squaring to compute
$2104^{23}$ (mod 2183). Then check your answer using PowerMod[b,p,n]. Or do a
similar computation using SAGE or whatever is your favorite program for computing
modulo n.


We want to encrypt a message: NEVER GIVE UP. Take the message and translate the 
letters to their numerical equivalent. Form blocks of two letters. Add Xs as necessary to 
make the last block of four numbers (two letters). 


NE VE RG IV EU PX 
1304 2104 1706 0821 0420 1523 


Blocks are used to avoid making the message easy to decipher by decoding using the 
frequency with which various letters should occur in a message. 

To encipher a block B replace it by $E(B) = C \equiv B^t$ (mod n). I am doing this using
Mathematica. Of course there are many other programs that would work, but maybe not as
easily.

So we need to compute stuff.


$1304^{23} \equiv 404$ (mod 2183), $2104^{23} \equiv 1867$ (mod 2183), $1706^{23} \equiv 950$ (mod 2183),
$0821^{23} \equiv 1304$ (mod 2183), $0420^{23} \equiv 964$ (mod 2183), $1523^{23} \equiv 1215$ (mod 2183).


This means that the encrypted message is 0404 1867 0950 1304 0964 1215. 
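
The whole enciphering step can be reproduced in a few lines of Python instead of Mathematica. This is only a sketch of the toy scheme described above, with the public key (37 · 59, 23) hard-coded; the helper names `to_blocks` and `encrypt` are made up for this illustration, and padding with X follows the convention used for NEVER GIVE UP.

```python
import string

N, T = 37 * 59, 23   # toy public key (n, t) from the text

def to_blocks(message):
    """Turn a message into 4-digit blocks: A=00, ..., Z=25, two letters per block."""
    letters = [c for c in message.upper() if c in string.ascii_uppercase]
    if len(letters) % 2:
        letters.append("X")            # pad so the last block has two letters
    codes = [string.ascii_uppercase.index(c) for c in letters]
    return [100 * codes[i] + codes[i + 1] for i in range(0, len(codes), 2)]

def encrypt(message, n=N, t=T):
    """Replace each block B by B^t mod n."""
    return [pow(b, t, n) for b in to_blocks(message)]

blocks = to_blocks("NEVER GIVE UP")
cipher = encrypt("NEVER GIVE UP")
print(" ".join(f"{b:04d}" for b in blocks))
print(" ".join(f"{c:04d}" for c in cipher))
```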


Exercise 4.1.2 Check that the decryption of
0404 1867 0950 1304 0964 1215
is indeed NEVER GIVE UP.


Exercise 4.1.3 Suppose that the public key is still (n = pq, t) = (37 · 59, 23). You need to
decrypt the message:


0404 1867 0551 0496 1337 0643 0026.


Hint. First you must find $s \equiv t^{-1} \pmod{\phi(pq)}$. Mathematica can do this via
PowerMod[t, -1, EulerPhi[p*q]].


Once you replace the blocks B by $B^s$ (mod 2183), you will still need to translate the pairs
of numbers to letters using the table above.


Exercise 4.1.4 Using the same public key as in the preceding exercise, what is the encrypted 
version of the message GROUPS RULE? 


Our encryption/decryption algorithm depends on the assumption that the block B in a
message to encrypt or decrypt satisfies gcd(B, n) = 1, where n = pq, for distinct primes p
and q. Thus, for example, any message including the block AA would be troublesome. It is
possible to encrypt such a block B, but decryption would not seem to make sense. However
Kumanduri and Romero [62, p. 137] note that as long as you have a message B that is not
divisible by both p and q, you can decrypt. Sadly we did not really manage to ensure this will
not happen in our toy example since pq < 2525 and AA is a possible block. Nevertheless let
us see how the argument goes. Suppose that gcd(B, p) = p and B = kp, where gcd(k, q) = 1.
Then you can still decrypt $B^t$ as $B^{ts} \equiv B$ (mod pq), where $s \equiv t^{-1} \pmod{\phi(pq)}$. So we have
two congruences:


$B^{ts} \equiv B \pmod{q}$,
$B^{ts} \equiv B \equiv 0 \pmod{p}$.


Then it follows that $B^{ts} \equiv B$ (mod pq).
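
For the toy key this claim can be checked exhaustively. The sketch below assumes Python 3.8 or later, where `pow(t, -1, phi)` returns a modular inverse; it encrypts and then decrypts every residue mod pq, including the blocks that share a factor with pq.

```python
from math import gcd

p, q, t = 37, 59, 23
n, phi = p * q, (p - 1) * (q - 1)
s = pow(t, -1, phi)        # decryption exponent: t*s ≡ 1 (mod phi)

# Blocks that violate gcd(B, n) = 1, i.e. multiples of p or of q.
bad_blocks = [B for B in range(n) if gcd(B, n) != 1]

# Encrypt with t, decrypt with s, and compare with the original block.
round_trip_ok = all(pow(pow(B, t, n), s, n) == B for B in range(n))
bad_ok = all(pow(pow(B, t, n), s, n) == B for B in bad_blocks)
print(round_trip_ok, bad_ok)
```

So in this small example decryption still recovers every block; the trouble with such blocks is rather that gcd(B, n) reveals a factor of n.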


Exercise 4.1.5 As we said in the preceding paragraph, our encryption/decryption algorithm
depends on the assumption that the block b in a message to encrypt or decrypt satisfies
gcd(b, n) = 1, where n = pq, for distinct primes p and q. What is the probability that b does
not satisfy this condition? Estimate this probability for the primes p = 37, q = 59; then for
primes p, q > 101.


Hint. Note that the set of numbers $0 \le b \le x$ which are divisible by p is
$\{0, p, 2p, 3p, \ldots, \lfloor x/p \rfloor p\}$. Then recall the inclusion-exclusion principle in Exercise 1.8.17.


Exercise 4.1.6 Suppose that the public key is still (n = pq, t) = (37 · 59, 23). You need to
decrypt the message: 


0429 1384 1150 2037 1473. 


Exercise 4.1.7 Assume, as usual, that p and q are distinct large primes. Also assume that
p > q. Show that finding $\phi(pq)$ is not easier than factoring m = pq by showing that


$$p + q = m - \phi(m) + 1 \quad \text{and} \quad p - q = \sqrt{(p + q)^2 - 4m}.$$


Then show that - once m and $\phi(m)$ are known - you can easily find p, also q.


There are also precautions that must be taken when choosing the primes p and q in order
to avoid factoring tricks that might be applied to pq. Both p - 1 and q - 1 should have at
least one large prime factor and gcd(p - 1, q - 1) should be small. Also p and q should not
be too close together. There is an example showing why this last restriction is made in the
exercise below based on a problem from Koblitz [55, p. 93].


Exercise 4.1.8 Suppose that our large primes are p and q with p > q. If n = pq, then


$$n = \left(\frac{p + q}{2}\right)^2 - \left(\frac{p - q}{2}\right)^2.$$


Let a = (p + q)/2 and b = (p - q)/2. Both a and b are integers. Assume that b is small.
Then $b^2 = a^2 - n$. You can test integers $a > \sqrt{n}$ to see if $a^2 - n$ is a perfect square.
That will give you a and b, which in turn give p = a + b and q = a - b.


(a) Use this method (which goes back to Fermat) to factor 777 923. 
(b) Then use the same method to factor 869 107. 
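
The search described in this exercise can be sketched as follows. The function name `fermat_factor` is chosen here, the code assumes Python 3.8 or later for `math.isqrt`, and it is tried on the toy modulus 2183 from earlier in the section rather than on the numbers in parts (a) and (b).

```python
from math import isqrt

def fermat_factor(n):
    """Factor an odd integer n by searching for a with a*a - n a perfect square.

    Returns (a + b, a - b) where b*b == a*a - n. The search is fast only when
    the two factors are close together.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    a = isqrt(n)
    if a * a < n:
        a += 1                     # start at the smallest a with a*a >= n
    while True:
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:            # a*a - n is a perfect square
            return a + b, a - b
        a += 1

print(fermat_factor(2183))         # (59, 37)
```

For 2183 the loop stops at the second value of a it tries, since $48^2 - 2183 = 121 = 11^2$.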


The protocols used are also important. If the same message is sent to more than one
person using the same pq, the message g can be recovered without knowing $\phi(pq)$ - at
least if two of the public keys $t_i$ are relatively prime. This happens because we know $g^{t_1}$ and
$g^{t_2}$ (mod pq) and if $t_1$ and $t_2$ are relatively prime we know that there are integers u, v such
that $t_1 u + t_2 v = 1$. But then $g \equiv (g^{t_1})^u (g^{t_2})^v$ (mod pq).
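
Here is a sketch of this recovery in Python. It assumes Python 3.8 or later, where `pow` accepts a negative exponent together with a modulus (this requires the ciphertext to be relatively prime to the modulus, since one of u, v is negative). The names `extended_gcd` and `recover_message` are hypothetical, and the sample block 1304 is just an illustration.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = extended_gcd(b, a % b)
    return g, v, u - (a // b) * v

def recover_message(c1, t1, c2, t2, n):
    """Recover g from c1 = g^t1 and c2 = g^t2 (mod n) when gcd(t1, t2) = 1."""
    d, u, v = extended_gcd(t1, t2)
    if d != 1:
        raise ValueError("the two public exponents must be relatively prime")
    # A negative exponent makes pow() use a modular inverse of the ciphertext.
    return (pow(c1, u, n) * pow(c2, v, n)) % n

n = 2183
g = 1304                                   # sample message block
c1, c2 = pow(g, 23, n), pow(g, 17, n)      # the same block sent under two keys
print(recover_message(c1, 23, c2, 17, n))  # 1304
```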


Exercise 4.1.9 Suppose that pq = 2183 and the public key for Spock is $t_1 = 23$ while that for
Kirk is $t_2 = 17$. With m = pq, using our usual block encryption method from the preceding
exercises, the encrypted message to Spock is $g^{23} \equiv a$ (mod m), while the message to Kirk is
$g^{17} \equiv b$ (mod m). Find g. Here the blocks for a are 1635 2034 0061 2027 and the blocks for
b are 0478 0961 1707 1562. Figure out the message without factoring pq.


Kumanduri and Romero [62, pp. 139ff] note that if you could find the decrypting
exponent s somehow without knowing $\phi(pq)$ then you could factor n = pq. They use a
probabilistic method that makes use of the fact that the congruence $x^2 \equiv 1$ (mod pq) has
two solutions more than the obvious two which are $\pm 1$ (mod pq). If you have a solution x
which is not $\pm 1$ (mod pq), then you know that pq divides (x - 1)(x + 1). This will allow
you to factor n = pq by computing gcd(n, x + 1) or gcd(n, x - 1). We will not go into the
details of the use of the exponent s to produce a solution of $x^2 \equiv 1$ (mod pq) but instead
give an example or two.


Example. How to Factor pq if You Know the Decrypting Exponent as Well as the Encrypting
Exponent. Suppose that $pq = 18\,833$, the encrypting exponent is $t = 47$, and the decrypting
exponent is $s = 10\,895$. Then we factor $ts - 1 = 2^a b$, where $b$ is odd. We find that $a = 6$
and $b = 8001$. Next we pick a random number $w = 63$. We should note that we needed
to try six other random $w$ before we found a "good" one. Now we create a sequence:
$x_0 \equiv w^b \pmod{pq}$, $x_i \equiv x_{i-1}^2 \pmod{pq}$, for $i = 1, \ldots, a$. We know that $x_a \equiv 1 \pmod{pq}$. If
we choose the first $i$ such that $x_{i+1} \equiv 1 \pmod{pq}$, then assuming that $x_i$ is not congruent
to $\pm 1 \pmod{pq}$, we can find our $p$ and $q$. In our case $x_0 \equiv 63^{8001} \equiv 11\,915 \pmod{18\,833}$.
Then $x_1 \equiv x_0^2 \equiv 4071 \pmod{18\,833}$. And $x_2 \equiv x_1^2 \equiv 1 \pmod{18\,833}$. Thus we should look at
$\gcd(4070, 18\,833) = 37$ or $\gcd(4072, 18\,833) = 509$. We have $18\,833 = 37 \cdot 509$. Our algorithm
fails when $x_0 \equiv 1 \pmod{pq}$ or $x_i \equiv -1 \pmod{pq}$ for some $i$. Kumanduri and Romero compute
that the probability that this happens for one choice of $w$ is $1/2$. Thus the probability of failure
for $k$ random choices of $w$ is $1/2^k$. In our example, we tried seven random values of $w$ and
$2^{-7} \approx 0.0078$. ▲
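
The procedure of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `factor_from_exponents` is hypothetical, the base $w$ is supplied by the caller rather than chosen at random, and the function simply returns `None` when the chosen $w$ is a "bad" one.

```python
from math import gcd

def factor_from_exponents(n, t, s, w):
    """Try to split n = p*q from the exponents t, s using the single base w.

    Returns (p, q) with p <= q, or None if this w is a "bad" choice.
    """
    g = gcd(w, n)
    if 1 < g < n:                      # lucky accident: w shares a factor with n
        return tuple(sorted((g, n // g)))
    b, a = t * s - 1, 0
    while b % 2 == 0:                  # write t*s - 1 = 2^a * b with b odd
        b //= 2
        a += 1
    x = pow(w, b, n)                   # x_0 = w^b (mod n)
    if x == 1 or x == n - 1:
        return None                    # x_0 = +-1: this w tells us nothing
    for _ in range(a):
        y = x * x % n                  # next term of the sequence
        if y == 1:                     # x is a square root of 1 other than +-1
            p = gcd(x - 1, n)
            return tuple(sorted((p, n // p)))
        if y == n - 1:
            return None                # hit -1 before 1: bad w
        x = y
    return None

print(factor_from_exponents(18833, 47, 10895, 63))
```

With the numbers of the example the call reproduces the factorization $18\,833 = 37 \cdot 509$. In practice one would wrap the call in a loop over random values of $w$ until it stops returning `None`.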


Exercise 4.1.10 Find the four solutions $x \pmod{23 \times 37}$ of the simultaneous congruences:
$x^2 \equiv 1 \pmod{23}$,
$x^2 \equiv 1 \pmod{37}$.

Note that these values of $x$ indeed solve $x^2 \equiv 1 \pmod{23 \times 37}$.


Exercise 4.1.11 Imitate the example of factoring pq, assuming that you know both the 
encrypting and the decrypting exponents t,s, when pq =90743, t= 23, and s=70 247. 


Exercise 4.1.12 In our RSA cryptography, we choose our distinct large primes to be $p$ and $q$,
and then choose our encryption exponent to be $t$ with $\gcd(t, \phi(pq)) = 1$. Thus the decryption
exponent $s$ satisfies $st \equiv 1 \pmod{\phi(pq)}$. Show that $t^{\phi(\phi(pq))} \equiv 1 \pmod{\phi(pq)}$ and thus that
$s \equiv t^{-1} \equiv t^{\phi(\phi(pq)) - 1} \pmod{\phi(pq)}$.


Exercise 4.1.13 Prove that if $p$ and $q$ are distinct primes, then the multiplicative group $\mathbb{Z}_{pq}^*$
is isomorphic to the direct product $\mathbb{Z}_p^* \oplus \mathbb{Z}_q^*$.
4.2 Chemistry and the Finite Fourier Transform


If you want to predict the behavior of a building in an earthquake, you need to know the 
fundamental frequencies of vibration of the building (or a finite element matrix approxi- 
mation to the building) - as well as the likely frequencies of vibration of the earth on which 
the building sits caused by an earthquake. If you want to know the chemical constituents 
of a star, you need to know the spectrum. As Neil DeGrasse Tyson [121, p. 148] says: “In 
short, were it not for our ability to analyze spectra, we would know next to nothing about 
what goes on in the universe.” Symmetry groups have much to say about spectra, as we 
shall see in this section. Fourier analysis helps to uncover spectra of symmetric objects. 
Here we will mostly discuss Fourier analysis on a finite cyclic group. This is sufficient for 
a molecule like benzene. 

Let $G$ be a finite Abelian group of order $n$, with the group operation as addition. Let $\hat{G}$
denote the dual group of group homomorphisms $\chi : G \to \mathbb{T}$, where $\mathbb{T}$ is the circle group of
complex numbers of norm 1: that is,

$$\mathbb{T} = \{ z \in \mathbb{C} \mid |z| = 1 \}.$$

Here $|x + iy|^2 = x^2 + y^2$, for $x, y \in \mathbb{R}$. The group operation on $\mathbb{T}$ is multiplication. We often
call such $\chi$ a character of $G$. Then the dual group $\hat{G}$ is the set of all such characters $\chi$ of
$G$. What is the group operation on $\hat{G}$? Suppose $\chi, \psi \in \hat{G}$. Then if $g \in G$, define

$$(\chi\psi)(g) = \chi(g)\psi(g).$$


Exercise 4.2.1 If $i = \sqrt{-1}$, define $e^{ix} = \cos x + i \sin x$, for $x \in \mathbb{R}$. Using properties of powers,
show that $T(x) = e^{ix}$, for $x \in \mathbb{R}$, defines a group homomorphism from the additive group of
real numbers to the multiplicative group $\mathbb{T}$ of complex numbers of norm 1. Find the kernel
$K$ of $T$. Use the first isomorphism theorem from Section 3.5 to identify $\mathbb{T}$ with $\mathbb{R}/K$.


Exercise 4.2.2 Show that the dual group $\hat{G}$ corresponding to a finite Abelian group $G$ is
indeed a group.


Before defining the Fourier transform, we consider a "new" kind of multiplication of
functions. Suppose we are given two functions $f, g$ mapping our group $G$ into the complex
numbers. We define the convolution of functions $f$ and $g$ to be $f * g$ where:

$$(f * g)(a) = \frac{1}{|G|} \sum_{b \in G} f(a - b)\, g(b), \quad \text{for } a \in G. \tag{4.1}$$

We read $f * g$ as $f$ splat $g$. The operation is important for digital signal processing.

The following exercises give the algebraic properties of splat and show that it mirrors
addition in the Abelian group $G$. Note that the function $\delta_0$ has the property that
$f * \delta_0 = (1/|G|) f$ and thus $|G|\,\delta_0$ is the identity for the operation splat. It is possible to define
convolution of (integrable) functions $f : \mathbb{R} \to \mathbb{C}$, but you would need to replace the sum with
an integral. However, the identity for splat does not exist as a function on the real line.
This leads to the theory of distributions and the Dirac delta distribution; see Terras [118].
When functions on $\mathbb{R}$ are splatted together it tends to smooth out the corners. Convolution
of our functions on a finite group like the additive group $\mathbb{Z}_n$ can mimic that behavior when
they approximate functions on $\mathbb{R}$; see Exercise 4.2.12. Convolution of a nasty function
on $\mathbb{R}$ that has lots of corners with a nice smooth function on $\mathbb{R}$ gives a nice function. In
probability and statistics convolution of probability densities corresponds to the addition
of independent random variables.
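
To make the normalization concrete, here is a minimal Python sketch of convolution on the cyclic group $\mathbb{Z}_n$, assuming the factor $1/|G|$ in front of the sum (the convention under which $|G|\,\delta_0$ is the identity, as stated above). A function on $\mathbb{Z}_n$ is stored as a list of length $n$; the names `convolve` and `delta` are hypothetical helpers, not notation from the text.

```python
def convolve(f, g):
    """Convolution on Z_n with the 1/n normalization.

    f and g are lists of length n; index a stands for the class a (mod n).
    """
    n = len(f)
    return [sum(f[(a - b) % n] * g[b] for b in range(n)) / n
            for a in range(n)]

def delta(a, n):
    """The function delta_a on Z_n: 1 at a (mod n) and 0 elsewhere."""
    return [1 if x == a % n else 0 for x in range(n)]

n = 6
f = [3, 1, 4, 1, 5, 9]
identity = [n * v for v in delta(0, n)]        # n * delta_0
print(convolve(f, identity))                   # same values as f
print(convolve(delta(2, n), delta(5, n)))      # supported at 2 + 5 = 1 (mod 6)
```

The second call illustrates how convolving two delta functions moves the support to the sum of the two group elements, which is the sense in which splat mirrors addition in $G$.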

Exercise 4.2.3 Show that convolution has the following properties for a finite Abelian group
$G$ under addition:


(a) $f * g = g * f$;
(b) $f * (g + h) = f * g + f * h$;
(c) $f * (g * h) = (f * g) * h$.


Exercise 4.2.4 Let $a$ be an element of a finite Abelian group $G$ under addition. Define the
function $\delta_a(x) = 1$ if $x = a$ and $\delta_a(x) = 0$ otherwise. Let $n = |G|$.


(a) Show that $\delta_a * \delta_b = \frac{1}{n}\,\delta_{a+b}$.
(b) Suppose $f : G \to \mathbb{C}$. Show that $(f * \delta_a)(x) = \frac{1}{n} f(x - a)$, for all $x \in G$.

Next we define the Fourier transform on a finite Abelian group $G$. This Fourier transform
can be used to simplify convolution equations and analyze spectra of Cayley graphs for
$G$ - among other things. We define the (finite) Fourier transform of a function
$f : G \to \mathbb{C}$ at the character $\chi \in \hat{G}$ by:

$$\hat{f}(\chi) = |G|^{-1} \sum_{y \in G} f(y)\, \overline{\chi(y)}. \tag{4.2}$$


Since G is a finite Abelian group, there are no convergence problems. In fact, the theory 
of Fourier transforms on finite Abelian groups has many applications, since it is just what 
is needed for the fast Fourier transform or FFT - an idea which has decreased immensely 
the time needed to compute such transforms. For example, it is useful when one needs to 
multiply humongous integers. It was first found by Gauss in computing the orbit of the 
asteroid Juno in 1805. See Terras [116] for more information on finite and fast Fourier 
transforms. The continuous analog of the finite Fourier transform is a fundamental tool 
in applied mathematics. It will of course involve an integral and can be used to solve 
differential equations. See my book [118, Chapter 1] for that subject. The finite version is a 
much more easily computed animal. 

From now on assume that the group $G$ is cyclic of order $n$. So we will identify $G$ with
$\mathbb{Z}/n\mathbb{Z}$ under addition. Then we can show that the characters $\chi \in \hat{G}$ can be identified with
exponentials. Suppose $a, b \in \mathbb{Z}/n\mathbb{Z}$. We will identify elements of $\mathbb{Z}/n\mathbb{Z}$ and integers
representing the cosets when we plug them into functions on $\mathbb{Z}/n\mathbb{Z}$. Define $\chi_a(b) = e^{2\pi i ab/n}$, for
$a, b \pmod{n}$.

The space of functions $L^2(G) = \{ f : \mathbb{Z}/n\mathbb{Z} \to \mathbb{C} \}$ is a vector space over $\mathbb{C}$ of dimension
equal to $n$. (See Section 7.1 if you do not remember the definition of dimension of a vector
space.) Moreover $L^2(G)$ has an inner product given by

$$\langle f, g \rangle = \sum_{x \in \mathbb{Z}/n\mathbb{Z}} f(x)\, \overline{g(x)}. \tag{4.3}$$


Exercise 4.2.5

(a) Show that as a vector space $L^2(\mathbb{Z}_n)$ can be identified with $\mathbb{C}^n$. In fact, show that a
convenient basis consists of the functions $\delta_a$, $a \in \mathbb{Z}_n$, defined in Exercise 4.2.4.

(b) Show that, under the operation of convolution, the space $L^2(\mathbb{Z}_n)$ has an identity element
$e$ such that $e * f = f$ for all $f \in L^2(\mathbb{Z}_n)$.

(c) Is $L^2(\mathbb{Z}_n)$ a group under convolution?

The space $L^2(\mathbb{Z}_n)$ is a vector space with a product defined by convolution. As such it
is called the group algebra of the group $\mathbb{Z}_n$ under addition. Exercise 4.2.4 shows that
the convolution of the functions $\delta_a$ defined in that exercise corresponds to addition of
the elements $a$ in $\mathbb{Z}_n$. The theory of the structure of such algebras was developed by
J. H. M. Wedderburn. Emmy Noether noticed in 1929 that this structure theory could be
used to do Fourier analysis on groups. However, I personally find that Noether's methods
make the subject less intuitive - especially for those of us who were introduced to classical Fourier
analysis in a course that used the subject as a tool for the solution of partial differential
equations.

Linear algebra (see Section 7.1) tells us that any basis of a vector space has the same 
number of elements and that an inner product space has an orthonormal basis. The additive 
characters of $\mathbb{Z}_n$ are orthogonal by the following lemma, which also says that to normalize
them we must just multiply by $n^{-1/2}$.


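Lemma 4.2.1 (Orthogonality of Characters on $\mathbb{Z}/n\mathbb{Z}$). Suppose $a, b \in \mathbb{Z}_n$. With $\chi_a(b) = e^{2\pi i ab/n}$
we have the following formula for inner products defined by (4.3):

$$\langle \chi_a, \chi_b \rangle = \begin{cases} n & \text{if } a \equiv b \pmod{n}, \\ 0 & \text{otherwise.} \end{cases}$$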


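Proof. First what is $\langle \chi_a, \chi_b \rangle$? From the definitions, we obtain

$$\langle \chi_a, \chi_b \rangle = \sum_{x \in \mathbb{Z}/n\mathbb{Z}} \chi_a(x)\, \overline{\chi_b(x)} = \sum_{x \in \mathbb{Z}/n\mathbb{Z}} \chi_{a-b}(x).$$

If $n$ divides $(a - b)$, then $\chi_{a-b}(x) = 1$ for all $x \in \mathbb{Z}/n\mathbb{Z}$ and the result is clear as we are
summing $n$ 1s. Otherwise, let $c = a - b$, which is not divisible by $n$. Call the sum on the
right-hand side of the equality $S$. That is,

$$S = \sum_{x \in \mathbb{Z}/n\mathbb{Z}} \chi_c(x).$$

Note that

$$\chi_c(1)\, S = \chi_c(1) \sum_{x \in \mathbb{Z}/n\mathbb{Z}} \chi_c(x) = \sum_{x \in \mathbb{Z}/n\mathbb{Z}} \chi_c(x + 1) = S.$$

The last equality holds since $x \mapsto x + 1$ is a 1-1 map of $\mathbb{Z}/n\mathbb{Z}$ onto itself. But then $S$ must
be 0, as $\chi_c(1) \neq 1$ because $c$ is not divisible by $n$. ▲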


Exercise 4.2.6 Show that if $G = \mathbb{Z}_n$, the dual group $\hat{G}$ of additive characters of $G$ is
isomorphic to $G$.


The following proposition says something about why the Fourier transform is useful. 
It changes the complicated multiplication of convolution to the simpler one of pointwise 
product. Moreover, it is an invertible transformation. 




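Proposition 4.2.1 (Some Properties of the Fourier Transform). Let the group $G = \mathbb{Z}_n$ under
addition. Use the preceding notation for the Fourier transform on $G$ and convolution of
functions on $G$.


(1) Convolution

$$\widehat{f * g}(\chi) = \hat{f}(\chi)\, \hat{g}(\chi), \quad \text{for all } \chi \in \hat{G}.$$

(2) Inversion

$$f(x) = \sum_{\chi \in \hat{G}} \hat{f}(\chi)\, \chi(x), \quad \text{for all } x \in G.$$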


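Proof.
(1) Note that

$$\widehat{f * g}(\chi) = n^{-2} \sum_{z \in G} \sum_{y \in G} f(z - y)\, g(y)\, \overline{\chi(z)} = n^{-2} \sum_{y \in G} g(y)\, \overline{\chi(y)} \sum_{w \in G} f(w)\, \overline{\chi(w)} = \hat{f}(\chi)\, \hat{g}(\chi),$$

where we set $w = z - y$ and reversed the order of summation.
(2) Observe that

$$\sum_{\chi \in \hat{G}} \chi(x)\, \hat{f}(\chi) = \sum_{\chi \in \hat{G}} \chi(x)\, \frac{1}{n} \sum_{y \in G} f(y)\, \overline{\chi(y)} = \sum_{y \in G} f(y)\, \frac{1}{n} \sum_{\chi \in \hat{G}} \chi(x - y) = f(x),$$

since $\overline{\chi(y)} = \chi(y)^{-1} = \chi(-y)$, and Lemma 4.2.1 states that:

$$\frac{1}{n} \sum_{\chi \in \hat{G}} \chi(x - y) = \begin{cases} 1, & x \equiv y \pmod{n}, \\ 0, & \text{otherwise.} \end{cases} \tag{4.4}$$

▲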
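
Both parts of the proposition can be checked numerically on $\mathbb{Z}_6$. The following Python sketch is only an illustration: it assumes the characters $\chi_a(b) = e^{2\pi i ab/n}$ and the $1/n$ normalization in both the transform and the convolution, and the helper names (`character`, `fourier`, `inverse_fourier`, `convolve`) are hypothetical.

```python
import cmath

def character(a, n):
    """chi_a on Z_n as the list of values chi_a(b) = exp(2*pi*i*a*b/n)."""
    return [cmath.exp(2j * cmath.pi * a * b / n) for b in range(n)]

def fourier(f):
    """Finite Fourier transform: entry a is (1/n) * sum_y f(y) * conj(chi_a(y))."""
    n = len(f)
    return [sum(f[y] * character(a, n)[y].conjugate() for y in range(n)) / n
            for a in range(n)]

def inverse_fourier(fhat):
    """Inversion: f(x) = sum over a of fhat(chi_a) * chi_a(x)."""
    n = len(fhat)
    return [sum(fhat[a] * character(a, n)[x] for a in range(n))
            for x in range(n)]

def convolve(f, g):
    """Convolution on Z_n with the 1/n normalization."""
    n = len(f)
    return [sum(f[(a - b) % n] * g[b] for b in range(n)) / n for a in range(n)]

f = [3, 1, 4, 1, 5, 9]
g = [2, 7, 1, 8, 2, 8]
lhs = fourier(convolve(f, g))
rhs = [u * v for u, v in zip(fourier(f), fourier(g))]
print(max(abs(u - v) for u, v in zip(lhs, rhs)))                          # rounding error only
print(max(abs(u - v) for u, v in zip(inverse_fourier(fourier(f)), f)))   # rounding error only
```

This direct evaluation takes on the order of $n^2$ operations; the fast Fourier transform mentioned earlier computes the same quantities far more quickly for large $n$.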


Now we want to apply our transform to the study of benzene, $\mathrm{C_6H_6}$. This is a molecule to
be both admired and feared - admired for its stability, feared for its toxicity. See Figure 0.1 in
the preface - a figure which is basically a Cayley graph of the group $\mathbb{Z}/6\mathbb{Z}$ under addition
in which the generating set is $\{\pm 1 \pmod{6}\}$. The adjacency matrix of this graph is the
$6 \times 6$ matrix

$$A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}.$$

This is an example of a circulant matrix - each row is a shift of the row above it: that is, 
the entries of a row are a cyclic permutation of the row above. 

One can view the adjacency matrix as a matrix of the adjacency operator acting
on complex-valued functions $f(x)$, for $x$ in the Cayley graph $X(\mathbb{Z}_6, S)$, where
$S = \{\pm 1 \pmod{6}\}$. The action of the adjacency operator is given by

$$A f(x) = \sum_{s \in S} f(x + s) = n\, (\delta_S * f)(x), \tag{4.5}$$

where $\delta_S$ denotes the function that is 1 on $S$ and 0 elsewhere on $G$. As in equation (4.1),
$*$ denotes convolution of functions on $G$. The basis of $L^2(\mathbb{Z}_6)$ that gives rise to the adjacency
matrix $A$ is the set $\delta_j$, $j = 1, 2, \ldots, 6$, where $\delta_j(x) = 1$ if $x \equiv j \pmod{6}$ and $\delta_j(x) = 0$
otherwise.


Exercise 4.2.7 Give the details needed to prove the claim made in the preceding paragraph. 


The spectral theorem from linear algebra says that $L^2(G)$ has an orthogonal basis of
eigenfunctions of $A$. In fact we know these eigenfunctions in our special case. They are the
characters of the additive group $\mathbb{Z}/6\mathbb{Z}$. To see this, just note that

$$A\chi_b(x) = \chi_b(x + 1) + \chi_b(x - 1) = (\chi_b(1) + \chi_b(-1))\, \chi_b(x).$$

Thus the eigenvalues of the adjacency matrix $A$ are $\lambda_b = \chi_b(1) + \chi_b(-1) = 2\cos(2\pi b/6)$,
$b = 0, 1, 2, 3, 4, 5$. This is a case for which the finite Fourier transform is easier to use than
Matlab, Mathematica, Scientific Workplace or whatever is your favorite program. So the
spectrum - spec(A) for short - or the set of eigenvalues of the adjacency matrix coming
from benzene is:

$$\operatorname{spec}(A) = \{2\cos(0),\ 2\cos(\pi/3),\ 2\cos(2\pi/3),\ 2\cos(\pi),\ 2\cos(4\pi/3),\ 2\cos(5\pi/3)\} = \{2, 1, -1, -2, -1, 1\}. \tag{4.6}$$
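
For readers who do want to check this by machine, the following Python sketch applies the adjacency operator of the hexagon to each character and compares the result with $2\cos(2\pi b/6)$ times the character. It is a small verification under the assumption $S = \{\pm 1 \pmod 6\}$; the function names are hypothetical.

```python
import cmath
import math

def adjacency(f, S):
    """Apply the adjacency operator: (Af)(x) = sum of f(x + s) over s in S."""
    n = len(f)
    return [sum(f[(x + s) % n] for s in S) for x in range(n)]

def check_characters(n=6, S=(1, -1)):
    """For each b return (b, lambda_b, max |A chi_b - lambda_b chi_b|)."""
    results = []
    for b in range(n):
        chi = [cmath.exp(2j * cmath.pi * b * x / n) for x in range(n)]
        lam = 2 * math.cos(2 * math.pi * b / n)
        error = max(abs(u - lam * v) for u, v in zip(adjacency(chi, S), chi))
        results.append((b, lam, error))
    return results

for b, lam, error in check_characters():
    print(b, round(lam, 6), error < 1e-12)
```

Each character is an eigenfunction up to rounding error, so no general-purpose eigenvalue routine is needed for a circulant matrix like this one.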


In order to think about the stability of benzene, one considers a system of vibrating 
springs arranged in a hexagon. The solutions can be expressed as linear combina- 
tions of states corresponding to the orthogonal eigenfunctions; see Terras [116, p. 214]. 
Exercise 4.3.8 involves a simpler system of springs. 

In 1932 the chemist Hückel argued that the stability is governed by the rest mass energy
defined by summing the largest half of the spectrum of $A$ in (4.6), where $n$ is the number of vertices:

$$E = \frac{2}{n} \sum_{\substack{\lambda \in \operatorname{spec}(A) \\ \lambda \text{ in the largest half}}} \lambda.$$

For benzene we get $E = \frac{1}{3}(2 + 1 + 1) \approx 1.3333$. If you compare this with the value of $E$ for
a four-vertex ring like cyclobutadiene, you see that it is indeed larger. We leave that as an
exercise. A good reference for Hückel theory is Starzak [112].
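
As a rough check of the benzene value, here is a short Python sketch. It assumes the normalization $2/n$ times the sum of the largest $n/2$ eigenvalues of the $n$-cycle, which is an inference from the quoted value 1.3333 rather than a formula confirmed elsewhere in the text; the function name `rest_mass_energy` is hypothetical.

```python
import math

def rest_mass_energy(n):
    """(2/n) * (sum of the largest n/2 eigenvalues of the n-cycle), n even.

    The eigenvalues of the cycle are 2*cos(2*pi*b/n), b = 0, ..., n-1.
    The factor 2/n is an assumed normalization.
    """
    eigenvalues = sorted((2 * math.cos(2 * math.pi * b / n) for b in range(n)),
                         reverse=True)
    return 2 / n * sum(eigenvalues[: n // 2])

print(rest_mass_energy(6))   # benzene: approximately 1.3333
```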


Exercise 4.2.8 Compute the spectrum of the adjacency matrix of the Cayley graph
$X(\mathbb{Z}_6, \{1, 3, 5 \pmod{6}\})$.


Exercise 4.2.9 Compute the rest mass energy for a molecule corresponding to the Cayley
graph of $\mathbb{Z}/4\mathbb{Z}$ with generating set $\{\pm 1 \pmod{4}\}$.


It is possible to compute the rest mass energy for more complicated molecules such as
buckminsterfullerene $\mathrm{C_{60}}$. Here the Cayley graph is the soccerball (the old version), also
known as the truncated icosahedron. Since the group involved is not Abelian, more complicated
non-Abelian Fourier analysis is needed to see that the rest mass energy is 1.5. See
Terras [116].


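Exercise 4.2.10 If $G = \mathbb{Z}/n\mathbb{Z}$, under addition, suppose the adjacency operator $A$ on the Cayley
graph $X(G, S)$ is defined by (4.5) and the Fourier transform $\mathcal{F}$ is defined by (4.2). Show that
for any function $h(x)$, $x \in G$, the Fourier transform diagonalizes the adjacency operator,
meaning that if we write $\hat{h} = \mathcal{F}h$, then $(\mathcal{F} A \mathcal{F}^{-1} \hat{h})(\chi) = n\, \mathcal{F}\delta_S(\chi)\, \hat{h}(\chi)$, for all $\chi \in \hat{G}$. (The
factor $n$ comes from the normalizations in (4.1) and (4.2).) If
$\hat{G} = \{\chi_1, \ldots, \chi_n\}$, use the basis of $L^2(\hat{G})$ given by the set of functions $\delta_{\chi_j}$, $j = 1, 2, \ldots, n$,
where $\delta_{\chi_j}$ denotes the function that is 1 on the set $\{\chi_j\}$ and 0 otherwise. Then show that
the matrix you get for $\mathcal{F} A \mathcal{F}^{-1}$ is diagonal, with $j$th diagonal entry $n\, (\mathcal{F}\delta_S)(\chi_j)$.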


Exercise 4.2.11 (The Plancherel Theorem). Show that if we define the norm of a function
$f \in L^2(\mathbb{Z}_n)$ by $\|f\|_2^2 = \sum_{x \in \mathbb{Z}_n} |f(x)|^2$ and we make the same definition of the norm of a
function $F : \hat{\mathbb{Z}}_n \to \mathbb{C}$, then

$$\|f\|_2^2 = n\, \|\hat{f}\|_2^2.$$


Exercise 4.2.12 Consider $f : \mathbb{Z}_{101} \to \mathbb{R}$ defined by

$$f(x) = \begin{cases} 1, & x \equiv \pm 1, \pm 2, \pm 3, \pm 4, \pm 5 \pmod{101}, \\ 0, & \text{otherwise.} \end{cases}$$

Graph $f$ as a function on the real line for $x \in [-50, 50] \cap \mathbb{Z}$. Then graph $f * f$ and compare.
Note that the function $f$ is nonzero on $\{\pm 1, \pm 2, \pm 3, \pm 4, \pm 5 \pmod{101}\}$. This set is called
the support of $f$. What is the support of $f * f$?


The word “spectrum” is associated to spectroscopy - the study of the spectral properties 
of atoms and molecules. Spectroscopy will even allow the analysis of the elements in stars. 
For example, spectroscopy led to the discovery of helium in the spectrum of the sun before 
it was discovered on earth. See Neil DeGrasse Tyson [121, Chapter 15] for more of the 
fascinating story of spectroscopy and its use in cosmology. The subject really requires 
quantum mechanics, chemistry, and group representations (i.e., Fourier analysis on non- 
Abelian groups). I say a little more about the subject in [116]. 

There are many kinds of spectroscopy from shining a light on a prism to shining 
X-rays on a sample from calf thymus. The type known as X-ray diffraction spectroscopy 
was important for the discovery around 1953 of the structure of DNA as a double helix 
by Francis Crick and James Watson. That started with Photo 51, which can be found in 
a paper of Rosalind Franklin and Raymond Gosling. The photo shows a big X. See the 
Wikipedia article titled Photo 51. Sadly Rosalind Franklin died of ovarian cancer very soon 
after this work was done. One suspects the X-rays for that. The Wikipedia article on her 
life is a good start for trying to understand the history of this figure. The symmetry is quite 
visible in Photo 51 but the helix takes a leap of imagination - as well as a knowledge of 
the interpretation of such images in crystallography. 


Applications and More Examples 


4.3 Groups and Conservation Laws in Physics 


In this section algebra meets calculus to do a beautiful dance. To understand how invariance
of a Lagrangian for a physical system under a group such as the group O(3) of all
3 × 3 rotation matrices can lead to a conservation law such as the conservation of angular
momentum, one must know a bit about the calculus of variations. This is a subject
that sadly seems to have disappeared from the undergraduate mathematics curriculum. 
However, many books for applied mathematics students contain a chapter devoted to vari- 
ational calculus: for example, Courant and Hilbert [18], Cushing [21], and Greenberg [37]. 
Mathematica has a package called VariationalMethods which will solve calculus of vari- 
ation problems. There are also books wholly devoted to the subject such as Gelfand and 
Fomin [34] and Smith [109]. This section will require a bit of calculus of the more advanced 
sort. My favorite advanced calculus reference is Lang [63]. 

It was Emmy Noether who showed that when groups of transformations leave the 
Lagrangian of a physical system invariant, there is a corresponding conservation law. This 
theorem of Emmy Noether was published in 1918. It led physicists L. M. Lederman and 
C. T. Hill to write (in [68]) that Noether’s theorem is “certainly one of the most important 
mathematical theorems ever proved in guiding the development of modern physics.” 

Emmy Noether lived from 1882 to 1935 and was the creator of much of this algebra 
course. Sadly she lived in a time when women were not allowed to be professors in most 
universities and Hitler was forcing anyone who was Jewish to flee Europe or face death in a 
concentration camp. She died after only a few years in the US teaching at Bryn Mawr College
and lecturing at the Institute for Advanced Study in Princeton, New Jersey. See Kramer [59,
Chapter 28] for a short biography of Noether written by the mathematician Hermann Weyl.
There is also an interesting discussion of the Institute for Advanced Study - which hosted
many refugees from the Nazis such as Albert Einstein and Kurt Gödel - in Bardawil et al. [7].

An example of a calculus of variations problem is that of finding the curve connecting
two points $A$ and $B$ in 3-space that minimizes distance. Such a curve is called a geodesic.
This is the problem of finding a curve $p(t) = (x(t), y(t), z(t))$, for $t \in [a, b]$, such that $p(a) = A$,
$p(b) = B$ and the following integral is minimized:

$$L[p] = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\; dt.$$


You may feel that you already know the answer to this problem. If you replace the Euclidean 
distance with some other distance such as that on the surface of a sphere, then you may 
find the problem more interesting. 

The simplest result in the calculus of variations goes as follows. Euler found it in 1744.
Suppose that $F(x, y, z)$ is a nice function (e.g., possesses continuous second partials with
respect to all variables). Consider the functional

$$J[y] = \int_a^b F(x, y, y')\, dx$$

on the domain 


D= {y continuously differentiable such that y(a) =A and y(b) = B}. (4.7) 

Here "functional" just means a function whose variables are themselves functions. Then, if
$J[y]$ is a maximum or minimum, we call $y$ an extremal and it follows that $y$ must satisfy
the Euler-Lagrange equation:

$$\frac{\partial F}{\partial y} - \frac{d}{dx} \frac{\partial F}{\partial y'} = 0. \tag{4.8}$$

This is proved by a first derivative test argument.


Proof. A sketch of the proof which does a bit of cheating goes as follows. Let $y \in D$ as
defined by (4.7). We want to create a "straight line" in this function space. To do so, fix a
continuously differentiable function $h$ on $[a, b]$ such that $h(a) = h(b) = 0$. Then for $t \in \mathbb{R}$,
form the function $(y + th)(x)$. This new function of $x$ satisfies the same boundary conditions
as $y$ and thus lies in our domain $D$. Thus if $J[y]$ is a minimum over the $y$ in $D$, the new function
$j(t) = J[y + th]$ has a minimum at $t = 0$. Then we can apply the ordinary first derivative test
from calculus. We assume here that we can differentiate under the integral sign. And we
use the chain rule for functions of several variables. This gives:

$$0 = j'(0) = \left. \frac{dJ[y + th]}{dt} \right|_{t=0} = \int_a^b \left. \frac{d}{dt} F(x, y + th, y' + th') \right|_{t=0} dx = \int_a^b \left\{ \frac{\partial F(x, y, y')}{\partial y}\, h + \frac{\partial F(x, y, y')}{\partial y'}\, h' \right\} dx.$$

Then use integration by parts on the second term inside the $\{\ \}$ to see that

$$0 = \int_a^b \left\{ \frac{\partial F(x, y, y')}{\partial y} - \frac{d}{dx} \frac{\partial F(x, y, y')}{\partial y'} \right\} h\, dx + \left. \frac{\partial F(x, y, y')}{\partial y'}\, h \right|_a^b.$$

Since $h(a) = h(b) = 0$, it follows that the boundary term vanishes. Thus

$$0 = \int_a^b \left\{ \frac{\partial F(x, y, y')}{\partial y} - \frac{d}{dx} \frac{\partial F(x, y, y')}{\partial y'} \right\} h\, dx$$

for all continuously differentiable functions $h$ on $[a, b]$ such that $h(a) = h(b) = 0$. This
implies that the term inside $\{\ \}$ is identically zero, otherwise (using the continuity of the
function in $\{\ \}$) one could create a function $h$ to produce a contradiction. And that is indeed
the Euler-Lagrange equation. If you want a totally air-tight argument, see the references. ▲
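
The first derivative test used in this sketch can be watched numerically. The Python code below is an illustration under assumptions of our own choosing: it takes the arc length functional with $F = \sqrt{1 + (y')^2}$, the straight line $y$ from $(0, 0)$ to $(1, 2)$, and the perturbation $h(x) = \sin(\pi x)$, which vanishes at both endpoints. The integral is replaced by the length of a polygon through 201 sample points, and the names `arc_length` and `j` are hypothetical.

```python
import math

def arc_length(ys, a, b):
    """Length of the polygon through equally spaced samples ys over [a, b]."""
    n = len(ys) - 1
    dx = (b - a) / n
    return sum(math.hypot(dx, ys[i + 1] - ys[i]) for i in range(n))

def j(t, a=0.0, b=1.0, A=0.0, B=2.0, n=200):
    """j(t) = J[y + t*h] for the straight line y and the bump h."""
    ys = []
    for i in range(n + 1):
        s = i / n                        # s = (x - a) / (b - a)
        line = A + (B - A) * s           # the straight line y(x)
        bump = math.sin(math.pi * s)     # h(x), with h(a) = h(b) = 0
        ys.append(line + t * bump)
    return arc_length(ys, a, b)

for t in (-0.1, 0.0, 0.1):
    print(t, j(t))                       # the value at t = 0 is the smallest

slope = (j(1e-4) - j(-1e-4)) / 2e-4      # central difference estimate of j'(0)
print(slope)                             # close to 0
```

The estimate of $j'(0)$ is zero up to discretization and rounding error, which is what the Euler-Lagrange equation predicts for an extremal.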


For the example of geodesics in 3-space, we need to consider the problem with more than
one dependent variable. We ask for the curve $(y_1(x), y_2(x), y_3(x))$, $a \le x \le b$, maximizing or
minimizing the functional

$$J[y_1, y_2, y_3] = \int_a^b F(x, y_1, y_2, y_3, y_1', y_2', y_3')\, dx$$

such that $(y_1(a), y_2(a), y_3(a)) = A$ and $(y_1(b), y_2(b), y_3(b)) = B$. Then one can use a similar
argument to that just given to see that $y = (y_1, y_2, y_3)$ must satisfy a simultaneous system
of Euler-Lagrange equations:

$$\frac{\partial F}{\partial y_1} - \frac{d}{dx} \frac{\partial F}{\partial y_1'} = 0, \qquad \frac{\partial F}{\partial y_2} - \frac{d}{dx} \frac{\partial F}{\partial y_2'} = 0, \qquad \frac{\partial F}{\partial y_3} - \frac{d}{dx} \frac{\partial F}{\partial y_3'} = 0. \tag{4.9}$$

Exercise 4.3.1 Obtain the system of equations (4.9) by a similar argument to the one that 
worked for one dependent variable. 


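Now consider the problem of geodesics. What are the Euler-Lagrange equations? Note
that $F = \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}$ is independent of $x, y, z$. So the Euler-Lagrange equations say
that $x'/F$, $y'/F$, $z'/F$ are constant; if the curve is parameterized so that $F$ is constant (for example,
by arc length), the result is that $x'' = y'' = z'' = 0$. It follows that the tangent vector $(x', y', z')$ is a
constant vector. Thus the curve must
be a straight line. Well, we knew that. Moreover, we still must prove that the straight line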
gives a minimum. We will make this an exercise after discussing the invariance of a func- 
tional under a transformation of the coordinates. This is directly related to the application 
of group theory found by Emmy Noether. 

But before doing that, let us just note why this is useful for physicists. It is often possible 
to state the laws of physics as minimum principles for integrals on spaces of functions. For 
example, one can derive Newton’s equations of motion from the principle of least action. 
What is action? Suppose we have a particle of mass $m$ and position $(x(t), y(t), z(t))$ at time
$t$ acted on by a force $F$ such as gravity. Then we make the following definitions.


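Definition 4.3.1 The kinetic energy of the particle is $T = \frac{m}{2}\left( (x')^2 + (y')^2 + (z')^2 \right)$.


Definition 4.3.2 The potential energy of the particle is $U$, where the force $F = -\operatorname{grad} U = -\left( \frac{\partial U}{\partial x}, \frac{\partial U}{\partial y}, \frac{\partial U}{\partial z} \right)$.


Definition 4.3.3 The Lagrangian is $L = T - U$.


Definition 4.3.4 The action is $A = \int_{t_1}^{t_2} L\, dt$.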


The principle of least action says that the particle will move so as to minimize action. The
Euler-Lagrange equations say that this implies Newton's law - force = mass × acceleration.
In the situation of several dependent variables we get three Euler-Lagrange equations, one
for each variable:

$$\frac{\partial}{\partial x}(T - U) - \frac{d}{dt} \frac{\partial}{\partial x'}(T - U) = 0,$$

$$\frac{\partial}{\partial y}(T - U) - \frac{d}{dt} \frac{\partial}{\partial y'}(T - U) = 0,$$

$$\frac{\partial}{\partial z}(T - U) - \frac{d}{dt} \frac{\partial}{\partial z'}(T - U) = 0.$$

The end result is that $F = m(x'', y'', z'') = ma$, where $a$ is acceleration.

Exercise 4.3.2 Prove the last statement.


Many of the favorite differential equations of mathematical physics can be derived from
variational principles, such as Maxwell's equations and the Schrödinger equation. This makes
the properties somewhat easier to understand, I think - especially when one knows the
following theorem of Emmy Noether.


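Before stating her theorem, we need a definition. We say that a functional
$J[y] = \int_a^b F(x, y, y')\, dx$ is invariant under a transformation $x^* = \Phi(x, y, y')$, $y^* = \Psi(x, y, y')$ if

$$\int_{a^*}^{b^*} F\left( x^*, y^*, \frac{dy^*}{dx^*} \right) dx^* = \int_a^b F\left( x, y, \frac{dy}{dx} \right) dx,$$

where $a^*$ and $b^*$ denote the transformed endpoints.

As an example, consider any functional which is independent of $x$, that is, $J[y] = \int_a^b F(y, y')\, dx$.
Then if we consider the translation or shift transformation $x^* = x + t$, for some
real $t$, we see that to shift the curve you get $y^*(x^*) = y(x^* - t)$. Then using the formula for
substitution in a definite integral:

$$J[y^*] = \int_{a+t}^{b+t} F\left( y^*(x^*), \frac{dy^*}{dx^*} \right) dx^* = \int_a^b F\left( y(x), \frac{dy}{dx} \right) dx = J[y].$$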


We have already considered the three-dimensional version of the following exercise, but 
we did not prove that straight lines minimize distance. Thus it may be useful to do this 
exercise to obtain a simpler approach that works in other situations. For example, similar 
arguments can be used to see that great circles minimize distance on the sphere and to find 
geodesics in the non-Euclidean upper half plane. See Terras [118, pp. 111 and 151]. 


Exercise 4.3.3 Consider a curve $p(t) = (x(t), y(t))$ in the plane, for $a \le t \le b$, connecting the
points $A = (x(a), y(a))$ and $B = (x(b), y(b))$. The length of this curve is

$$L[p] = \int_a^b \sqrt{x'(t)^2 + y'(t)^2}\; dt.$$

Show that the minimum is achieved when the curve is a straight line. Thus straight lines
are geodesics - no surprise.


Hint. It is not hard to see that the arc length is invariant under translation of the
point $A$ to the origin and rotation of the point $B$ to the $y$-axis. Then it is clear that
$\sqrt{(x'(t))^2 + (y'(t))^2} \ge |y'(t)|$. The segment of the curve on the $y$-axis clearly minimizes
the functional.


Next we can finally state Emmy Noether’s theorem. 


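Theorem 4.3.1 (Emmy Noether's Theorem). Suppose that the functional

$$J[y] = \int_a^b F(x, y, y')\, dx$$

is invariant under the one-parameter family of transformations defined for all values of
the real parameter $\alpha$ by

$$x^\alpha = \Phi(x, y, y'; \alpha), \qquad y^\alpha = \Psi(x, y, y'; \alpha)$$

such that $x^0 = x$ and $y^0 = y$, for any choice of $a$ and $b$. Then setting

$$\varphi(x, y, y') = \left. \frac{\partial \Phi(x, y, y'; \alpha)}{\partial \alpha} \right|_{\alpha = 0} \quad \text{and} \quad \psi(x, y, y') = \left. \frac{\partial \Psi(x, y, y'; \alpha)}{\partial \alpha} \right|_{\alpha = 0},$$

we have

$$\frac{\partial F}{\partial y'}\, \psi + \left( F - y'\, \frac{\partial F}{\partial y'} \right) \varphi = \text{constant}$$

along each curve $y = y(x)$ such that $\frac{\partial F}{\partial y} - \frac{d}{dx} \frac{\partial F}{\partial y'} = 0$. We are of course assuming that the
functions $\Phi(x, y, y'; \alpha)$ and $\Psi(x, y, y'; \alpha)$ are continuously differentiable.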


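Proof. To sketch the proof of Noether's theorem, proceed as follows. First we parameterize
our curves: $(x^\alpha(t), y^\alpha(t))$, with the parameter $t$ between 0 and 1. Then we can rewrite our
functional using $'$ to mean $d/dt$,

$$G(x, x', y, y') = F\left( x, y, \frac{y'}{x'} \right) x',$$

to obtain

$$J[x, y] = \int_0^1 G(x, x', y, y')\, dt.$$

Since $J[x^\alpha, y^\alpha] = $ constant, for all $\alpha \in \mathbb{R}$, it follows (using integration by parts again) that

$$0 = \left. \frac{d}{d\alpha} J[x^\alpha, y^\alpha] \right|_{\alpha = 0} = \int_0^1 \left\{ \left( \frac{\partial G}{\partial x} - \frac{d}{dt} \frac{\partial G}{\partial x'} \right) \varphi + \left( \frac{\partial G}{\partial y} - \frac{d}{dt} \frac{\partial G}{\partial y'} \right) \psi \right\} dt + \left. \left( \frac{\partial G}{\partial x'}\, \varphi + \frac{\partial G}{\partial y'}\, \psi \right) \right|_{t=0}^{t=1}.$$

Euler-Lagrange says

$$\frac{\partial G}{\partial x} - \frac{d}{dt} \frac{\partial G}{\partial x'} = 0 \quad \text{and} \quad \frac{\partial G}{\partial y} - \frac{d}{dt} \frac{\partial G}{\partial y'} = 0,$$

so that only the boundary term remains.
This implies the theorem using the definition of $G$ and the fact that $a$ and $b$ are arbitrary. ▲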
Exercise 4.3.4 Fill in the details in the preceding argument. 


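It is possible to extend Noether's theorem to the case of more than one dependent variable.
Suppose we have a functional

$$J[y_1, \ldots, y_n] = \int_a^b F(x, y_1, \ldots, y_n, y_1', \ldots, y_n')\, dx$$

and suppose that $J$ is invariant under the one-parameter family of transformations:

$$x^\alpha = \Phi(x, y_1, \ldots, y_n, y_1', \ldots, y_n'; \alpha),$$
$$y_i^\alpha = \Psi_i(x, y_1, \ldots, y_n, y_1', \ldots, y_n'; \alpha), \quad i = 1, \ldots, n,$$

such that $x^0 = x$ and $y_i^0 = y_i$. Then Noether's theorem says that along each extremal $y_1, \ldots, y_n$
for $J[y_1, \ldots, y_n]$, we have

$$\sum_{i=1}^n \frac{\partial F}{\partial y_i'}\, \psi_i + \left( F - \sum_{i=1}^n y_i'\, \frac{\partial F}{\partial y_i'} \right) \varphi = \text{constant}, \tag{4.10}$$

where

$$\varphi(x, y_1, \ldots, y_n, y_1', \ldots, y_n') = \left. \frac{\partial \Phi(x, y_1, \ldots, y_n, y_1', \ldots, y_n'; \alpha)}{\partial \alpha} \right|_{\alpha = 0}$$

and

$$\psi_i(x, y_1, \ldots, y_n, y_1', \ldots, y_n') = \left. \frac{\partial \Psi_i(x, y_1, \ldots, y_n, y_1', \ldots, y_n'; \alpha)}{\partial \alpha} \right|_{\alpha = 0}. \tag{4.11}$$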

Exercise 4.3.5 Give a "proof" of Noether's theorem for functions of more than one dependent
variable using the same argument we gave for the case of one dependent variable.


It is also possible to extend all of our theorems in this section, including Noether’s theo- 
rem, to the case of more than one independent variable. Then the Euler-Lagrange equations 
become partial differential equations. 


Examples 


1. Consider a system of particles as in Definitions 4.3.1-4.3.4. Suppose that the system is
conservative: that is, $U$ does not depend on time $t$ explicitly. Then Noether's theorem
implies that the total energy $T + U$ is constant throughout the motion.

2. Similarly, if the action integral is invariant under the group of translations of the
$x, y, z$ variables, then Noether's theorem says that the total momentum $m(x', y', z')$ is
conserved.

3. If the action integral is invariant under rotations, then angular momentum is constant.

4. The functional associated with the electromagnetic field and Maxwell's equations is
invariant under many transformations of 4-space and this leads to 15 conservation
laws. ▲


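Let us give a few details for Example 1. The functional to be minimized is the action

$$A = \int_{t_1}^{t_2} L\, dt, \quad \text{where } L = T - U.$$

Here, for a system of $n$ particles at position $(x_i, y_i, z_i)$ at time $t$, the kinetic energy is

$$T = \frac{1}{2} \sum_{i=1}^n m_i \left( (x_i')^2 + (y_i')^2 + (z_i')^2 \right).$$

The potential energy is

$$U = U(x_1, \ldots, x_n, y_1, \ldots, y_n, z_1, \ldots, z_n)$$

so that the force acting on the $i$th particle is

$$F_i = -\left( \frac{\partial U}{\partial x_i}, \frac{\partial U}{\partial y_i}, \frac{\partial U}{\partial z_i} \right).$$

In Example 1, we are assuming that $U$ does not depend on time $t$ explicitly. That
is, our independent variable is $t$ and our shift transformation function $\Phi$ is
$t^* = \Phi(t, x, y, z, x', y', z'; \alpha) = t + \alpha$, for all $\alpha \in \mathbb{R}$. Our $\Psi$ function is the identity, that is,
$x_i^* = x_i$, $y_i^* = y_i$, $z_i^* = z_i$. It follows that the differentiated functions from (4.11) with respect to $\alpha$
are $\varphi = 1$ and $\psi_i = 0$. Then Noether's theorem implies that, writing $L_{x_i'} = \partial L / \partial x_i'$ and so on,

$$\sum_{i=1}^n \left( L_{x_i'} \cdot 0 + L_{y_i'} \cdot 0 + L_{z_i'} \cdot 0 \right) + \left( L - \sum_{i=1}^n \left( x_i' L_{x_i'} + y_i' L_{y_i'} + z_i' L_{z_i'} \right) \right) \cdot 1$$

is constant along extremals of the action. It follows that the total energy $T + U$ is constant.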
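
A numerical instance may help before attempting the general check. The Python sketch below evaluates the Noether expression with $\varphi = 1$ and $\psi = 0$ for a single particle on a spring, an example chosen here for illustration and not taken from the text: $L = \frac{1}{2} m (x')^2 - \frac{1}{2} k x^2$, with the known solution $x(t) = c \cos(\omega t)$, $\omega = \sqrt{k/m}$, of Newton's equation $m x'' = -kx$. The function name and parameters are hypothetical.

```python
import math

def noether_quantity(t, m=1.0, k=1.0, amplitude=1.0):
    """L - x' * dL/dx' along x(t) = amplitude * cos(omega * t).

    This is the Noether expression for time translation (phi = 1, psi = 0)
    applied to the Lagrangian L = m*x'^2/2 - k*x^2/2.
    """
    omega = math.sqrt(k / m)
    x = amplitude * math.cos(omega * t)
    v = -amplitude * omega * math.sin(omega * t)     # x'(t)
    lagrangian = 0.5 * m * v * v - 0.5 * k * x * x   # L = T - U
    return lagrangian - v * (m * v)                  # dL/dx' = m * x'

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, noether_quantity(t))                    # the same value each time
```

The printed value is $-(T + U)$, so its constancy along the motion is the conservation of total energy in this special case.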


Exercise 4.3.6 Check the last statement. 


Maybe you have been wondering: where are the groups in all of this? The answers for 
the examples go as follows. 


(1) The group of translations of time is $\mathbb{R}$, the group of real numbers under addition.

(2) The group of translations of vectors in $\mathbb{R}^3$ is the additive group of vectors in 3-space.
Of course this is not a one-parameter group. To say that a group $G$ is a one-parameter
group means that you can express any group element in the form $g = g(t)$, $t \in \mathbb{R}$, where
$g(t)$ is a nice function: for example, continuous, differentiable. But you can certainly
build up $\mathbb{R}^3$ out of many one-parameter subgroups.

(3) The group is the rotation group $O(3) = \{3 \times 3 \text{ real matrices } U \text{ such that } {}^T U = U^{-1}\}$,
where ${}^T U$ denotes the transpose of $U$. That is, if $U = (u_{ij})$, then ${}^T U = (u_{ji})$. The group
operation is matrix multiplication defined as in formula (1.3) of Section 1.8 and the
sentence following it. Again the group is not a one-parameter group. You can express
it using three parameters (the Euler angles).

(4) The group is the Lorentz group $O(3, 1)$ consisting of $4 \times 4$ real matrices $A$ such that
${}^T A J A = J$, where $J$ denotes the matrix

$$J = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

The group operation is matrix multiplication. This is discussed in the book of Gelfand
and Fomin [34, p. 184 ff]. The Lorentz group is important for Einstein's theory of special
relativity.


Exercise 4.3.7 Show that the last two examples O(3) and O(3, 1) are groups. 


Hint. Recall that ${}^T(AB) = {}^T B\, {}^T A$ and ${}^T(A^{-1}) = ({}^T A)^{-1}$. It follows that we can write
${}^T(A^{-1}) = ({}^T A)^{-1} = {}^T A^{-1}$.


The groups O(3) and O(3, 1) are Lie groups - named for Sophus Lie. Such groups are
much studied and are of great interest for physics and chemistry. The Lorentz group is 
fundamental for the theory of special relativity. 
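
The defining conditions are easy to test on concrete matrices. The Python sketch below checks that a rotation about the $z$-axis satisfies ${}^T U\, U = I$ and that a boost mixing the first and fourth coordinates satisfies ${}^T A J A = J$, with $J$ the diagonal matrix with entries $1, 1, 1, -1$. These two sample matrices and all helper names are illustrative choices, not taken from the text, and a numerical check of a few matrices is of course no substitute for the proof asked for in Exercise 4.3.7.

```python
import math

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def rotation_z(theta):
    """Rotation by theta about the z-axis, an element of O(3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def boost(r):
    """Hyperbolic rotation mixing coordinates 1 and 4, an element of O(3, 1)."""
    c, s = math.cosh(r), math.sinh(r)
    return [[c, 0.0, 0.0, s], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [s, 0.0, 0.0, c]]

J = [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, -1.0]]

def in_O3(U):
    return close(matmul(transpose(U), U), identity(3))

def in_O31(A):
    return close(matmul(matmul(transpose(A), J), A), J)

print(in_O3(rotation_z(0.7)), in_O31(boost(0.3)))
print(in_O31(matmul(boost(0.3), boost(-1.1))))   # closure under multiplication
```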


Exercise 4.3.8 Consider the vibrating system consisting of two masses connected by springs
to each other and two walls as in Figure 4.1. Assume the two bodies each have mass $m$ and
that the three springs each have stiffness constant $k$. Then the laws of physics tell us that
the kinetic and potential energies of the system are

$$T = \tfrac{1}{2} m (x')^2 + \tfrac{1}{2} m (y')^2 \quad \text{and} \quad U = \tfrac{1}{2} k x^2 + \tfrac{1}{2} k (x - y)^2 + \tfrac{1}{2} k y^2.$$


Use the principle of least action to derive the system of differential equations that rule 
the motion of the system. 


Figure 4.1 Vibrating system of two masses


I discuss a slightly more general vibrating system than that of the preceding problem 
in Terras [116, p. 215ff]. This system leads to an eigenvalue problem that helps to explain 
the importance of spectra for the understanding of vibrating systems. You can extend the 
theory to a system of vibrating masses arranged in a ring of six masses say. This gives a 
model for understanding benzene. See Starzak [112, Chapter 5] for more examples. 


4.4 Puzzles 


Next we want to consider an application of group theory to a puzzle - sudoku. There are 
many other puzzles one could choose, such as Rubik’s cube (see Joyner [49]). However, it 
seems preferable to keep the puzzles as easy as possible. Even restricting our sudokus to 
smaller grids than are interesting to sudoku experts, we come face to face with some fairly 
big groups. 


Applications and More Examples 


A good reference is Jason Rosenhouse and Laura Taalman [94]. Another is Crystal Lorch 
and John Lorch [71]. 

A classic sudoku puzzle is a 9 x 9 grid in which a number of clues are entered. The object
is to fill in the rest of the grid so that each row has the numbers from 1 to 9 in some order, 
ditto for each column, ditto for each of the nine 3 x 3 blocks into which the grid is divided 
by two equally spaced vertical lines and two equally spaced horizontal lines. For the puzzle 
to be valid it must have one and only one solution. Often the puzzles are made to have 
some symmetry as well. 

Here we will simplify our group theory by looking at shidoku or junior sudoku, which 
involves a 4 x 4 grid to be filled in with the numbers from 1 to 4. An example from
Rosenhouse and Taalman [94, p. 160] is 


2 || 3 
3 1 


The four blocks are outlined with double lines. This is a puzzle that has the maximal number 
of clues for a not totally trivial puzzle with a unique solution. The following example 
has the minimal number of clues in some sense, again from Rosenhouse and Taalman 
[94, p. 167] 


We leave it to the reader to solve these puzzles. 
Exercise 4.4.1 Solve the preceding two shidoku puzzles. 


There are many questions one could ask. How many puzzles are there? What is the group 
acting on puzzles? How many orbits does this group have in the set of shidoku puzzles? 
Burnside’s formula should help us once the group is identified. However brute force will 
also work. But we are impelled to take the group approach. 

We shall say that a shidoku grid is a 4 x 4 grid, divided into 2 x 2 blocks, that has been
completely filled in with the numbers 1, 2, 3, 4 so that each row, column, and block contains
all of the numbers 1, 2, 3, 4. A shidoku puzzle is a grid with missing entries. What is the
group acting on the shidoku grids? Well, clearly $S_4$ acts by permuting (or relabeling) the
entries in the grid. Then there is the group K generated by the following operations:


1) switch rows 1 and 2 or rows 3 and 4; 

2) switch columns 1 and 2 or columns 3 and 4; 

3) switch the top two blocks with the bottom two blocks; 

4) switch the right two blocks with the left two blocks; 

5) rotate the grid counterclockwise by 90°; 

6) transpose the grid (as a 4 x 4 matrix) - meaning interchange column i with row i, 
i=1,2,3,4. 


Exercise 4.4.2 Show that it is not legal to switch rows 1 and 3.


The shidoku group G is generated by these six operations on the grids plus $S_4$. To count
the number of essentially different shidokus is to count the number of orbits of G on the
shidoku grids. Burnside’s lemma is set up to do this. The group G is the direct product of $S_4$
with K. The group G has order 3072 = 24 x 128. It is considered in great detail in the paper
of Elizabeth Arnold, Rebecca Field, Stephen Lucas and Laura Taalman [1]. To use Burnside’s
lemma, one wants a smaller group.
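
The order 128 can be checked by brute force. Here is a minimal sketch in Python, under the assumption that each of the six operations is encoded as a permutation of the 16 cell positions (numbered 0 to 15, row by row); the function names are hypothetical and this is not the method of the paper [1].

```python
N = 4  # side length of the grid

def position_perm(source):
    """Encode a grid operation as a tuple p with new[r][c] = old[source(r, c)],
    where cells carry the row-major indices 0..15."""
    return tuple(N * sr + sc
                 for r in range(N) for c in range(N)
                 for sr, sc in [source(r, c)])

def swap(i, j):
    """The transposition of i and j on {0, 1, 2, 3}."""
    return lambda x: j if x == i else i if x == j else x

r12, r34 = swap(0, 1), swap(2, 3)
bands = lambda x: (x + 2) % N            # exchanges {0, 1} with {2, 3}

generators = [
    position_perm(lambda r, c: (r12(r), c)),      # 1) switch rows 1 and 2
    position_perm(lambda r, c: (r34(r), c)),      # 1) switch rows 3 and 4
    position_perm(lambda r, c: (r, r12(c))),      # 2) switch columns 1 and 2
    position_perm(lambda r, c: (r, r34(c))),      # 2) switch columns 3 and 4
    position_perm(lambda r, c: (bands(r), c)),    # 3) top blocks <-> bottom blocks
    position_perm(lambda r, c: (r, bands(c))),    # 4) left blocks <-> right blocks
    position_perm(lambda r, c: (c, N - 1 - r)),   # 5) rotate counterclockwise by 90 degrees
    position_perm(lambda r, c: (c, r)),           # 6) transpose
]

def generated_group(gens):
    """Close the identity under multiplication by the generators."""
    identity = tuple(range(N * N))
    seen, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = tuple(p[g[i]] for i in range(N * N))   # compose p with g
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return seen

K = generated_group(generators)
print(len(K), 24 * len(K))   # orders of K and of the direct product with S_4
```

Since the group is finite, closing under products with the generators alone (no inverses) already gives the whole group.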


Exercise 4.4.3 

(a) Show that the group K generated by the six operations listed above has order 128. 

(b) Since the action of $S_4$ commutes with the action of K, show that the shidoku group G
is isomorphic to the direct product of $S_4$ with K.


Before attempting to use Burnside’s lemma, let us first use brute force to find the number 
of inequivalent shidoku grids under the action of the shidoku group G. It is not really so 
hard to see that there are two orbits. Use the permutations of entries to see that we can take 
the upper left-hand block to be that in the grid below: 


  1 2 | . .
  3 4 | . .
  ----+----
  . . | . .
  . . | . .


Then by permutation we can complete the first row and column as follows: 


  1 2 | 3 4
  3 4 | . .
  ----+----
  2 . | . .
  4 . | . .


Now there are essentially three things to try: 


  1 2 | 3 4      1 2 | 3 4      1 2 | 3 4
  3 4 | 1 2      3 4 | 1 2      3 4 | 2 1
  ----+----      ----+----      ----+----      (4.12)
  2 1 | 4 3      2 3 | 4 1      2 1 | 4 3
  4 3 | 2 1      4 1 | 2 3      4 3 | 1 2


Each of these grids is equivalent to 96 = 24 · 2 · 2 others under the action of the group G.
Thus there are 288 = 3 · 96 total shidoku grids. However, there is still an equivalence.


Exercise 4.4.4 Show that it is not possible to complete the following puzzle to a legal shidoku 
grid: 




Note that the second shidoku grid in (4.12) can be transformed to the third by a sequence 
of elements of G. Transpose the grid then permute the numbers 2 and 3. It follows that 
there are exactly two orbits of G on the shidoku grids. One has 96 elements and the other 
has 192 elements. 
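
The brute force count can also be left to a computer. The sketch below (my own illustration in Python, with hypothetical helper names) lists all shidoku grids and then finds the orbits under row swaps, the swap of the top and bottom blocks, the transpose, and relabelings of the entries. These moves suffice for the whole shidoku group: conjugating a row operation by the transpose gives the matching column operation, the rotation is a transpose followed by a reversal of the rows, and the transpositions (12), (23), (34) generate all relabelings.

```python
from itertools import permutations, product

N = 4
ROWS = list(permutations(range(1, N + 1)))

def is_shidoku(g):
    """g is a tuple of 4 rows, each already a permutation of 1..4."""
    columns_ok = all(len({g[r][c] for r in range(N)}) == N for c in range(N))
    blocks_ok = all(len({g[r][c] for r in (i, i + 1) for c in (j, j + 1)}) == N
                    for i in (0, 2) for j in (0, 2))
    return columns_ok and blocks_ok

grids = [g for g in product(ROWS, repeat=N) if is_shidoku(g)]

def swap_rows(g, i, j):
    rows = list(g)
    rows[i], rows[j] = rows[j], rows[i]
    return tuple(rows)

def transpose(g):
    return tuple(zip(*g))

def relabel(g, a, b):
    exchange = {a: b, b: a}
    return tuple(tuple(exchange.get(v, v) for v in row) for row in g)

moves = [
    lambda g: swap_rows(g, 0, 1),          # rows 1 <-> 2
    lambda g: swap_rows(g, 2, 3),          # rows 3 <-> 4
    lambda g: (g[2], g[3], g[0], g[1]),    # top blocks <-> bottom blocks
    transpose,
    lambda g: relabel(g, 1, 2),
    lambda g: relabel(g, 2, 3),
    lambda g: relabel(g, 3, 4),
]

def orbits(points, moves):
    """Partition the points into orbits of the group generated by the moves."""
    unseen, result = set(points), []
    while unseen:
        start = unseen.pop()
        orbit, frontier = {start}, [start]
        while frontier:
            g = frontier.pop()
            for move in moves:
                h = move(g)
                if h not in orbit:
                    orbit.add(h)
                    frontier.append(h)
        unseen -= orbit
        result.append(orbit)
    return result

orbit_sizes = sorted(len(o) for o in orbits(grids, moves))
print(len(grids), orbit_sizes)   # 288 [96, 192]
```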

Rather than applying Burnside’s lemma to the rather large group G, Arnold et al. [1]
instead apply it to the subgroup G, $= \langle s, t \rangle \times S_4$ of $H \times S_4$, where s denotes the operation
of switching rows 3 and 4 of the shidoku grid, t denotes transpose of the grid, and $\langle s, t \rangle$
denotes the group generated by s and t. This group has order 192, which is the minimal
order of a group which partitions the shidoku grids into the same number of orbits as G.
Arnold et al. [1] then use the Burnside lemma to see that the number of orbits of G, is
indeed 2. They note that it is possible to visualize what is happening by creating a graph
in which each vertex represents a shidoku grid. Then an edge corresponds to generators
of the groups acting on the grid. For G, they take generators s, t, (12), (23), (34), (14). See
Figure 6 of the paper [1].

We consider this Burnside’s lemma calculation. First one needs to find the conjugacy
classes for the group H generated by s, the operation of switching rows 3 and 4 of the
shidoku grid, and t, the transpose of the grid. One finds that H is

$$H = \langle s, t \rangle = \{e, s, t, st, ts, sts, tst, stst\}, \quad \text{where } e \text{ is the identity.} \tag{4.13}$$

The defining relations for H are $s^2 = t^2 = e$ and $(st)^4 = e$. The group H has conjugacy classes
$\{g\} = \{xgx^{-1} \mid x \in H\}$ represented by $e, s, t, st, (st)^2$.


Exercise 4.4.5 


(a) Prove equation (4.13) and show that $(st)^4 = e$.

(b) Since the group H is small, you should be able to identify it - especially after the next
section. The problem is that you need different generators. Try $x = st$ and t. Then you
will see the relations are $t^2 = e$ and $x^4 = e$, plus $xtx = t$. Thus you should be able to show
that H is $D_4$.


Exercise 4.4.6 Show that the conjugacy classes in the group $H = \langle s, t \rangle$ defined in (4.13)
are represented by $e, s, t, st, (st)^2$, where s denotes the operation of switching rows 3 and 4
of the shidoku grid, and t denotes transpose of the grid. Find the order of each conjugacy
class. You may find it easier to compute everything using the preceding exercise to realize
H as $D_4$.


With the generators s, t, the group H is known as a Coxeter group. Many of our favorite
groups turn out to be Coxeter groups: for example, all the dihedral groups, the symmetric
groups, groups of reflections in Euclidean space or hyperbolic space. An example of the
last group is $PSL(2, \mathbb{Z}) = SL(2, \mathbb{Z})/Z$, where $SL(2, \mathbb{Z})$ is the modular group of 2 x 2 integer
matrices of determinant 1 and Z denotes the center of $SL(2, \mathbb{Z})$. A Coxeter group is defined
to have generators $a_1, \ldots, a_n$ and relations $(a_i a_j)^{m_{ij}} = e$, the identity. Here $m_{ij} \in \mathbb{Z}^+ \cup \{\infty\}$,
$m_{ii} = 1$ and $m_{ij} \geq 2$, for $i \neq j$. Thus all generators have order 2. If $m_{ij} = \infty$, there is no relation
on $a_i a_j$. A reference for the subject is Björner and Brenti [10].


Exercise 4.4.7 Given a Coxeter group, as described in the preceding paragraph, show that
if for all $i \neq j$ we have $m_{ij} = 2$, then $a_i a_j = a_j a_i$.




Part | Groups 


Exercise 4.4.8 Show that the symmetric group $S_n$ is a Coxeter group.
Hint. Use the transpositions $(a \;\; a+1)$ to generate the symmetric group.


To use Burnside’s lemma, one must do a count. For each conjugacy class of H, one must 
count the invariant shidoku grids up to permutation of the entries. For example, consider 
the row swap s. The claim is that there is no shidoku grid for which a relabel will undo s. 
On the other hand if we consider t, we see that the following grid is invariant up to the
permutation by (23). 


  1 2 | 4 3              1 3 | 4 2
  3 4 | 2 1      t       2 4 | 3 1
  ----+----    ----->    ----+----
  4 3 | 1 2    <-----    4 2 | 1 3
  2 1 | 3 4     (23)     3 1 | 2 4


One finds that the number of grids invariant up to permutation of the entries under
the conjugacy class of t is 2 · 4!. This and the conjugacy class of the identity are the only
conjugacy classes leaving grids invariant up to entry permutation. It follows that Burnside’s
lemma says that the number of orbits of the group G, $= \langle s, t \rangle \times S_4$ acting on the grids is

$$\frac{1(12 \cdot 4!) + 2(0) + 2(2 \cdot 4!) + 2(0) + 1(0)}{8 \cdot 4!} = 2. \tag{4.14}$$
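
The Burnside count for the group $\langle s, t \rangle \times S_4$ can also be done by machine. What follows is a sketch in Python rather than the calculation of [1]: for each of the eight words in s and t listed in (4.13) it counts the pairs (relabeling, grid) such that the relabeling undoes the effect of the word on the grid, and then averages over the 8 · 4! = 192 group elements. All function names are hypothetical.

```python
from itertools import permutations, product

N = 4
ROWS = list(permutations(range(1, N + 1)))

def is_shidoku(g):
    columns_ok = all(len({g[r][c] for r in range(N)}) == N for c in range(N))
    blocks_ok = all(len({g[r][c] for r in (i, i + 1) for c in (j, j + 1)}) == N
                    for i in (0, 2) for j in (0, 2))
    return columns_ok and blocks_ok

grids = [g for g in product(ROWS, repeat=N) if is_shidoku(g)]

def s(g):
    """Switch rows 3 and 4."""
    return (g[0], g[1], g[3], g[2])

def t(g):
    """Transpose the grid."""
    return tuple(zip(*g))

def word(*ops):
    """Compose grid operations; word(s, t) applies t first and then s."""
    def apply(g):
        for op in reversed(ops):
            g = op(g)
        return g
    return apply

# The eight elements of H = <s, t>, in the order of (4.13).
names = ["e", "s", "t", "st", "ts", "sts", "tst", "stst"]
H = [word(), word(s), word(t), word(s, t), word(t, s),
     word(s, t, s), word(t, s, t), word(s, t, s, t)]

relabelings = list(permutations(range(1, N + 1)))   # sigma sends v to sigma[v - 1]

def relabel(g, sigma):
    return tuple(tuple(sigma[v - 1] for v in row) for row in g)

# fixed[name] = number of pairs (sigma, grid) with sigma(h(grid)) == grid
fixed = {name: sum(relabel(h(g), sigma) == g for g in grids for sigma in relabelings)
         for name, h in zip(names, H)}

group_order = len(H) * len(relabelings)              # 8 * 24 = 192
orbit_count = sum(fixed.values()) // group_order
print(fixed)
print(orbit_count)
```

Only e, t, and its conjugate sts contribute, with 288 = 12 · 4! and 48 = 2 · 4! fixed pairs respectively, in agreement with the numerator of (4.14).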


Exercise 4.4.9 Show that conjugate elements of H correspond to the same number of 
G-invariant shidoku grids. 


Exercise 4.4.10 Check the computation in equation (4.14). 


Since the number of G-inequivalent shidoku grids is only 2, you might worry that news- 
papers will run out of the classic sudoku 9 x 9 grids. However, Jarvis and Russell [48] found 
that the number of orbits of the larger group acting on 9 x 9 sudoku grids is 5 472 730 538 
using Burnside’s lemma and lots of computer time. So far, there is no simple way to find 
the number of orbits for classic sudoku. Of course, there are many puzzles corresponding 
to a single grid. McGuire et al. [76] show that 17 clues are necessary to produce a puzzle 
with a unique solution for classic sudoku. Gordon Royle’s website has at least 50 000 clas- 
sic sudoku puzzles with 17 clues. We ask for analogs for shidoku in the next problem (see 
Herzberg and Murty [43]). 


Exercise 4.4.11 What is the minimum number of clues necessary for a shidoku grid to have 
a unique solution? 


4.5 Small Groups 


We seek to list the groups of small order up to isomorphism. Of course the program Group 
Explorer does this for us. But we want to convince ourselves that the list is complete. 
Unfortunately, we may not have the patience to give every detail of the proofs at this 
point. Moreover we will want to use the Sylow theorems which we stated in Section 3.7 - 
evilly without proof. 




For orders p which are prime, we know by Corollary 3.3.1 of Lagrange’s theorem that
the group is cyclic $C_p$ and that is all the possibilities for group orders 2, 3, 5, 7, 11, 13.
Of course, there is only one group of order 1 as well.

It helps to know the fundamental theorem of finitely generated Abelian groups, which 
implies that a finite Abelian group is a direct product of cyclic groups. We will assume 
this theorem in this section. It has a very nice proof involving the analog of Gaussian 
elimination for matrices with integer entries. See Section 7.1 for a sketch of a proof. 

We have seen in Exercise 2.4.14 that there are two groups of order 4: the cyclic group
$C_4$ and the Klein 4-group $C_2 \oplus C_2$.

For order 6, there are two possibilities. If the group G of order 6 is not cyclic, it can 
only have elements of orders 1, 2, and 3 by Lagrange’s theorem. Cauchy’s theorem implies 
that there must be elements of orders 2 and 3. If G were Abelian, G would be cyclic by 
Exercise 3.6.17. Thus our group cannot be Abelian. 

We know that G must have an element h of order 3. Then $H = \langle h \rangle$ is a normal subgroup
of G, by Exercise 3.3.7. So G/H is cyclic of order 2. It follows that $G = \{e, h, h^2, g, gh, gh^2\}$,
for some $g \in G - H$. One can show that $g^2 = e$ and then that G is isomorphic to $S_3$. To do
this, use the fact that $g^2 \in \langle h \rangle = \{e, h, h^2\}$ and note that if $g^2 = h$ or $h^{-1}$, then g would
have order 6 and G would be cyclic. Thus $g^2 = e$. Then note that $ghg \in \{h, h^2\}$ since
the order of ghg is 3. But $ghg = h$ implies $gh = hg$ and the group is Abelian. Therefore
$ghg = h^2$.


Exercise 4.5.1 Complete the proof that any non-cyclic group of order 6 is isomorphic to S3. 


Hint. You know enough to do the multiplication table for G. 


If G is an Abelian group of order 8, there are as many possibilities as ways to write 8 as
a product of integers greater than 1, ignoring the order of the factors. We have 8, 4 · 2, 2 · 2 · 2.
Thus we have the Abelian groups $C_8$, $C_4 \oplus C_2$, $C_2 \oplus C_2 \oplus C_2$.

If G is a non-Abelian group of order 8, it turns out that there are two possibilities: the
dihedral group $D_4$ and the quaternion group Q. To see this, you need to think about the
possibilities for orders of elements. They cannot all have order 1 or 2 as then the group
would be Abelian by Exercise 2.1.10. So there must be an element h of order 4. Then $H = \langle h \rangle$
is a normal subgroup of G, by Exercise 3.3.7. So G is the disjoint union of H and gH, for
some $g \in G$. Moreover $g^2 \in H$.

Case 1. $ghg^{-1} = h^3$ and $g^2 = e$.

Case 2. $ghg^{-1} = h^3$ and $g^2 = h^2$.

It is a challenge to show that these are the only cases left to consider when G is a non- 
Abelian group of order 8. Then one must show that in the first case we get the dihedral 
group D, and in the second the quaternion group. 


Exercise 4.5.2 Explain the following statements arising from the preceding discussion of a 
non-Abelian group G of order 8, with element h of order 4. 


(a) We must have $ghg^{-1} = h^3$.
(b) It is not possible that there are two more cases with either $g^2 = h$ or $g^2 = h^3$.


Exercise 4.5.3 Explain why the groups $C_8$, $C_4 \oplus C_2$, $C_2 \oplus C_2 \oplus C_2$ are not isomorphic.

Exercise 4.5.4 Prove that there are only two possibilities for a group of order 9: the cyclic
group $C_9$ and the direct product $C_3 \oplus C_3$. In fact this generalizes to groups of order $p^2$, where
p is a prime.


Hint. By Lagrange’s theorem we are reduced to the case that all non-identity elements a
of G have order 3. Then you can show that $\langle a \rangle$ is a normal subgroup. To see this, obtain a
contradiction by assuming that $\exists\, b \in G$ such that $\langle a \rangle \neq b \langle a \rangle b^{-1}$. Then $b^{-1}$ is in some coset
of $b \langle a \rangle b^{-1}$ which will imply $b \in \langle a \rangle$. Moreover $G/\langle a \rangle$ has order 3. Use Exercise 3.6.12.


For order 10, there are again two possibilities: $C_{10}$ and $D_5$.

For order 12 there are five possibilities: $C_{12}$, $C_2 \oplus C_6$, $D_6$, $A_4$, the semi-direct product
$C_3 \rtimes C_4$ of $C_3$ and $C_4$. This last group $C_3 \rtimes C_4$ is a non-commutative group with generators
a, b satisfying the relations $a^4 = b^3 = e$ and $bab = a$. On the other hand, the dihedral group
$D_6$ has as generators a, b satisfying the relations $a^2 = b^6 = e$ and $bab = a^{-1}$.


Exercise 4.5.5 Show that there are only two groups of order 10. Feel free to use the Sylow 
theorems (particularly the third). 


Exercise 4.5.6 Prove that the following table has all the groups of orders 14 and 15. Again 
feel free to use the Sylow theorems (particularly the third). 


Finally we list the results of our thoughts about small groups in Table 4.1. 


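Table 4.1 Representative non-isomorphic groups of orders ≤ 15

order 1     $C_1$
order 2     $C_2$
order 3     $C_3$
order 4     $C_4$     $C_2 \oplus C_2$
order 5     $C_5$
order 6     $C_6$     $S_3$
order 7     $C_7$
order 8     $C_8$     $C_4 \oplus C_2$     $C_2 \oplus C_2 \oplus C_2$     $D_4$     $Q$
order 9     $C_9$     $C_3 \oplus C_3$
order 10    $C_{10}$    $D_5$
order 11    $C_{11}$
order 12    $C_{12}$    $C_2 \oplus C_6$     $D_6$     $A_4$     $C_3 \rtimes C_4$
order 13    $C_{13}$
order 14    $C_{14}$    $D_7$
order 15    $C_{15}$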


The only new group on our list is the semi-direct product $C_3 \rtimes C_4$. You can view this
group as the points (x, y) with $x \in \mathbb{Z}_3$ under addition and $y \in \mathbb{Z}_4$ under addition, where the
group operation $*$ is the following for $x, u \in \mathbb{Z}_3$ and $y, v \in \mathbb{Z}_4$:

$$(x, y) * (u, v) = (x + (-1)^y u, \; y + v). \tag{4.15}$$


For a semi-direct product of two groups G and H to be defined, there must be a group
action of H on G. In this case $h \in \mathbb{Z}_4$ acts on $g \in \mathbb{Z}_3$ by $h \cdot g = (-1)^h g$. For more information
on semi-direct products, see Dummit and Foote [28].

Exercise 4.5.7 Show that equation (4.15) makes the Cartesian product $\mathbb{Z}_3 \times \mathbb{Z}_4$ into a group.
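
Before attempting a proof, one can test the group axioms by brute force on the 12 pairs. The following Python sketch (the names are my own) uses the rule $(x, y) * (u, v) = (x + (-1)^y u, y + v)$ with x, u taken mod 3 and y, v taken mod 4. A finite check of this kind is a sanity test, not a substitute for the argument asked for in the exercise.

```python
from itertools import product

elements = list(product(range(3), range(4)))   # pairs (x, y), x in Z_3, y in Z_4

def op(p, q):
    """(x, y) * (u, v) = (x + (-1)^y u, y + v), first slot mod 3, second mod 4."""
    (x, y), (u, v) = p, q
    return ((x + (-1) ** y * u) % 3, (y + v) % 4)

identity = (0, 0)

associative = all(op(op(p, q), r) == op(p, op(q, r))
                  for p in elements for q in elements for r in elements)
has_identity = all(op(identity, p) == p == op(p, identity) for p in elements)
has_inverses = all(any(op(p, q) == identity == op(q, p) for q in elements)
                   for p in elements)
commutative = all(op(p, q) == op(q, p) for p in elements for q in elements)

def order(p):
    """Smallest n >= 1 with p * p * ... * p (n factors) equal to the identity."""
    n, q = 1, p
    while q != identity:
        q, n = op(q, p), n + 1
    return n

orders = sorted(order(p) for p in elements)
print(associative, has_identity, has_inverses, commutative)
print(orders)
```

The list of element orders that comes out, with six elements of order 4 and two of order 6, is a convenient fingerprint for telling this group apart from the other groups of order 12.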


The group $C_3 \rtimes C_4$ is also called a dicyclic group (a special case of a metacyclic group).
See M. Hall [38].

We include in Figures 4.2 and 4.3 the multiplication table and the Cayley graph
$X(C_3 \rtimes C_4, \{a, b\})$ from Group Explorer. Note that in these figures the element a has order
4 and the element b has order 6.


Figure 4.2 Group Explorer’s multiplication table for the semi-direct product $C_3 \rtimes C_4$


Figure 4.3 Group Explorer draws the Cayley graph
$X(C_3 \rtimes C_4, \{a, b\})$


Exercise 4.5.8 Count the elements of order 2 in the three non-Abelian groups of order 12. 




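Exercise 4.5.9 Show that the group $\mathbb{Z}_3 \rtimes \mathbb{Z}_4$ can be identified with a group of 2 x 2 complex
matrices generated by the matrices $\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$ and $\begin{pmatrix} \omega & 0 \\ 0 & \omega^2 \end{pmatrix}$, where $i = \sqrt{-1} = e^{\pi i/2}$ and
$\omega = e^{2\pi i/3}$. Here $i^2 = -1$ and $i^4 = 1$, while $\omega^3 = 1$. The group operation is matrix multiplication.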


Exercise 4.5.10 Consider the general linear group $GL(3, \mathbb{Z}_2)$ consisting of 3 x 3 matrices
g whose entries come from $\mathbb{Z}_2$ such that $\det(g) \neq 0$. The group operation is matrix
multiplication. Show that $|GL(3, \mathbb{Z}_2)| = 168$. Find the center of this group.


Hint. To find the order of $GL(3, \mathbb{Z}_2)$, note that the first column can be any vector in $\mathbb{Z}_2^3$
except 0. The second column can be any vector in $\mathbb{Z}_2^3$ except a scalar multiple of the first
column. The third column must be outside the subspace of $\mathbb{Z}_2^3$ spanned by the first two
columns.
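
The count in the hint gives (8 - 1)(8 - 2)(8 - 4) = 168. As a cross-check, here is a short Python sketch (my own, with hypothetical names) that simply runs through all $2^9 = 512$ matrices with entries 0 and 1 and keeps those whose determinant is nonzero mod 2. It builds the matrices from rows rather than columns, which does not change the determinant.

```python
from itertools import product

def det3_mod2(m):
    """Determinant of a 3 x 3 matrix (given as three rows), reduced mod 2."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 2

vectors = list(product(range(2), repeat=3))     # the 8 vectors of Z_2^3
matrices = list(product(vectors, repeat=3))     # 512 matrices, listed by rows
invertible = [m for m in matrices if det3_mod2(m) != 0]

print(len(invertible), (8 - 1) * (8 - 2) * (8 - 4))   # 168 168
```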


Exercise 4.5.11 Show that $D_3$ is isomorphic to the affine group Aff(3) of matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$
with $a, b \in \mathbb{Z}_3$ and $a \neq 0$. The group operation is matrix multiplication.


Exercise 4.5.12 Show that the groups of order 12 on our list are not isomorphic. 


Exercise 4.5.13 Consider the affine group Aff(4) of matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ with $b \in \mathbb{Z}_4$ and
$a \in \mathbb{Z}_4^*$, with group operation given by matrix multiplication. Which of the groups of order 8 in
Table 4.1 is isomorphic to Aff(4)?


For more about the finite matrix groups in the exercises, see Terras [116]. One of the 
favorite graphs in this book is the Cayley graph attached to the affine group Aff(p), for 
prime p, with generating set 


$$S_{a,\delta} = \left\{ \begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix} \;\middle|\; x^2 = ay + \delta (y - 1)^2 \right\} \tag{4.16}$$


for $\delta$ a non-square in $\mathbb{Z}_p$ and $a \neq 0, 4\delta$. In Figure 4.4, we see the special case $X(\mathrm{Aff}(5), S_{1,2})$.
The graph is obtained by putting a star on every face of a dodecahedron. We call these
graphs “finite upper half plane graphs” and will consider them again in Section 8.3.


Figure 4.4 The Cayley graph X(Aff(5), S1,2), 
with generating set defined by equation 
(4.16), has edges given by solid green lines 
while the dashed magenta lines are the edges 
of a dodecahedron 


Exercise 4.5.14 Draw the Cayley graph $X(\mathrm{Aff}(3), S_{1,2})$. You should get an octahedron.


Exercise 4.5.15 Consider the Heisenberg group Heis(R) of matrices

$$(x, y, z) = \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}, \quad \text{with } x, y, z \in \mathbb{R}.$$

The group operation is matrix multiplication. Show that Heis(R) is indeed a group and find
its center. Then replace R by $\mathbb{Z}_p$ for prime p. Find the center of this finite group. If p = 2,
this finite Heisenberg group has order 8. To which group of order 8 is it isomorphic?


The Heisenberg group with the field R replaced with a finite ring Z/qZ, q= p’, p prime, 
has some interesting figures associated with the spectra of some of its Cayley graphs. One 
generating set that has been considered is the four-element set $S = \{(\pm 1, 0, 0), (0, \pm 1, 0)\}$.
Since the Heisenberg group is not commutative, one needs representation theory to study 
the spectra of the adjacency matrices of such graphs. This is described in Terras [116, 
Chapter 18]. Beautiful pictures come from separating the spectra corresponding to the 
representations of the Heisenberg group Heis(Z/qZ) which are homomorphisms from 
Heis(Z/qZ) into GL(n,C). Figure 4.5 was obtained in this way by my student Marvin Minei. 
Taking larger and larger values of q leads to better and better approximations to a figure 
of D. R. Hofstadter, who was considering matrices analogous to those from graphs for the 
Heisenberg group over Z. Hofstadter’s butterfly is a fractal and appears to be the limiting 
figure of those for Heis(Z/qZ) as $q \to \infty$. Hofstadter was interested in the subject thanks to
an application in quantum physics. You can find more information about this on Wikipedia. 


Figure 4.5 Butterfly from 
Cayley graph of Heis(Z/169Z) 


This section classified groups of order ≤ 15 and we did not give all the details. In particular,
many things were exercises using Sylow theorems. Moreover, we did not give an
exercise to show that the list for order 12 is correct. You can find these details in Dummit 
and Foote [28, pp. 184-185]. That is perhaps not so impressive. Why did we stop there? 
The answer is that it would have taken many pages to explain why there are nine non- 
isomorphic non-Abelian groups of order 16 (and five Abelian order 16 groups). We leave it 
to the interested reader to seek out the huge amount of information on the web concerning
groups of small order. The small groups library for the computer program GAP has a list of 
representatives of isomorphism classes of groups of order < 2000. See also Wikipedia. The 
groups of orders $2^n$ are particularly difficult to classify. It was only in the 1990s that it was
found that there are 56092 groups of order 256. Higher powers of 2 get into the millions 
and billions. 

What is the good of such classifications, you may ask? Chemists are fond of the classi- 
fication of the space groups of a crystal. Recall that there are 230 such groups (219 if we 
do not distinguish between mirror images). The types were enumerated by E. S. Fedorov,
A. Schoenflies, and W. Barlow, independently in the 1890s. The symmetry groups of chemical
quantities have an effect on the physical and spectroscopic properties of the molecules.

The buckyball or $C_{60}$, where now C stands for carbon and not cyclic, was discovered by
R. Smalley and H. Kroto in 1985. It is a truncated icosahedron (a soccerball) whose faces
consist of 20 hexagons and 12 pentagons. The icosahedral group $A_5$ leaves it invariant
and accounts for some of its properties. Chung and Sternberg [15] apply the representation 
theory (i.e., non-Abelian Fourier analysis) for the icosahedral group to explain the spectral 
lines of the buckyball. They find that the stability constant for the buckyball is greater than 
that for benzene considered in Section 4.2. 

The classification of groups is a major preoccupation of group theorists. From 1955 to the 
present at least 100 mathematicians have been working on classifying finite simple groups. 
Tens of thousands of pages of mathematics papers have been devoted to the project. Is the 
project finished? I have no idea. People seem to think there are no major gaps in the proofs. 
See Wikipedia and Wilson [127]. 

There are many types of finite simple groups: those occurring in infinite lists such as 
cyclic groups, alternating groups, Coxeter groups, groups of Lie type (e.g., groups like the 
projective special linear group $PSL(n, \mathbb{Z}_p) = SL(n, \mathbb{Z}_p)/Z$, where $SL(n, \mathbb{Z}_p)$ is the special
linear group of n x n matrices of determinant 1 and entries in $\mathbb{Z}_p$ for prime p, and Z is its
center - except when n = 2 and p = 2 or 3). Then there are 26 sporadic groups. The list of
sporadic groups ends with the monster group of order $\approx 8 \cdot 10^{53}$. See Wikipedia or Wilson
[127] for more information on the classification of finite simple groups. 

Classifying infinite groups has also been a major project. The favorite sorts of continuous 
infinite groups are Lie groups. We saw examples of Lie groups after our consideration of 
Emmy Noether’s theorem in Section 4.3. Another example is SL(n, C), the special linear 
group of n x n complex matrices of determinant 1. The simple Lie groups over the complex 
numbers (one of which is SL(n,C)) have been classified. Here the meaning of the word 
“simple” is different from that of finite group theory. W. Killing did this classification in 
1887. See Stewart [114]. Again there are exceptional groups on the list. Evidently Killing 
was not happy about that. In 1894 E. Cartan rederived Killing’s theory in his PhD thesis and 
received most of the credit for the classification of Lie groups. Much of modern physics and 
number theory involves analysis on Lie groups over not just the complex numbers but also 
the reals and something called the field $\mathbb{Q}_p$ of p-adic numbers. Then vectors with entries
from all of these groups over $\mathbb{R}$, $\mathbb{C}$, $\mathbb{Q}_p$ are put together to form one adelic group. Fourier
analysis on such groups is used in Wiles’ proof of Fermat’s last theorem for example. 

Groups are deeply embedded in various kinds of mathematics. For example, algebraic 
topology involves fundamental groups, homotopy groups, homology, and cohomology 
groups. We restrict ourselves here to the fundamental group of a finite graph X. The ele- 
ments of this group are equivalence classes of closed directed paths on the graph starting 
and ending at some fixed vertex. Two paths in the graph X are equivalent iff one can be 
continuously deformed into the other. The product of two paths C, D in X means first go
around C and then D. It turns out that the fundamental group of X is a free group on r
generators, where r is the number of edges left out of a spanning tree T in X. A spanning
tree T for a graph X means a connected graph with no closed paths such that T has the
same vertices as X. For example, if X is the complete graph on four vertices $K_4$ (alias the
tetrahedron graph), a spanning tree is pictured as the solid fuchsia edges in Figure 4.6 and
thus there are three dashed purple edges left out of $K_4$. One closed path is also indicated by
following the arrows around the outside triangle. You can create a topologically identical
graph by collapsing the spanning tree to a point. This new collapsed graph is a bouquet
of three loops as in Figure 4.7. It is fairly clear that each loop provides a generator of the
fundamental group of the bouquet graph. So the fundamental group of $K_4$ is the free group
on three generators.
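
The number r of generators is easy to compute for any finite connected graph. As a small illustration (a sketch in Python, with a hypothetical function name and a depth-first search chosen only for convenience), one can grow a spanning tree and count the edges left over. Any spanning tree of a connected graph with |V| vertices has |V| - 1 edges, so r = |E| - |V| + 1 no matter which tree is found.

```python
from itertools import combinations

def spanning_tree(vertices, edges):
    """Return the edges of a spanning tree found by depth-first search.
    The graph is assumed to be connected."""
    adjacency = {v: [] for v in vertices}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    root = vertices[0]
    visited, stack, tree = {root}, [root], []
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in visited:
                visited.add(v)
                tree.append((u, v))
                stack.append(v)
    return tree

vertices = [1, 2, 3, 4]
edges = list(combinations(vertices, 2))     # K_4 has all 6 possible edges
tree = spanning_tree(vertices, edges)
rank = len(edges) - len(tree)               # edges left out of the tree

print(len(tree), rank)   # 3 3
```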


Figure 4.6 A spanning tree for the tetrahedron graph 
is indicated in solid fuchsia lines. Since the three 
dashed purple edges are left out, the fundamental 
group of the tetrahedron graph is the free group on 
three generators. The arrows show a closed path on 
the tetrahedron graph 


Figure 4.7 The bouquet of three loops obtained by 
collapsing the tree in the tetrahedron graph of 
Figure 4.6 to point a 


Free groups can be used to construct the group G with presentation $\langle S : R \rangle$ as a quotient
of the free group generated by S modulo the normal subgroup generated by the relations R. 
There are some famous problems associated with such constructions. The word problem is 
that of deciding whether two words in the generators actually represent the same element 
of a group G defined by generators and relations. This problem was posed by Max Dehn 
in 1911. P. Novikov showed in 1955 that there is a finitely presented group whose word 
problem is undecidable. 

Dehn was famous for solving the third of David Hilbert’s 23 unsolved problems just 
after the problems were posed by Hilbert at the International Congress of Mathematicians 
in 1900. Dehn was the first to solve any of the problems, and some are still open. See 
Wikipedia for a list. There are now seven problems presented by the Clay Mathematics 
Institute in 2000 - of which only one has so far been solved. 




There are more undecidable problems relating to group presentations than the word prob- 
lem mentioned above. One is the problem of determining whether two groups defined by 
presentations are actually isomorphic. See Rabin [88]. 

We end our discussion of groups here with a passion flower from my garden with many 
different symmetries. The three-fold, five-fold, and ten-fold symmetries are obvious. I have 
tried to count the tiny hairlike petals and came up with a 90-fold symmetry. 


Figure 4.8 A passion flower 


Rings: A Beginning


5.1 Introduction 


In the preface we gave a partial introduction to rings - sets with two operations, usually
called + and x. Our main examples of rings that are not fields (i.e., closed under division
by nonzero elements) are the ring Z of integers and the ring of polynomials F[x] with
coefficients in a field, like $F = \mathbb{Z}_p$, for prime p. These rings are called commutative rings
since multiplication is commutative. This would be a good time for the reader to review
what we said about the ring of integers Z in Chapter 1. Our main examples of fields are Q,
C, and $\mathbb{Z}_p$ for prime p. Here we add a few pictures made using the arithmetic of finite rings.
The distinction between a commutative ring like Z and a field like Q is just the fact that in
Q we can divide by nonzero elements and remain in Q, while division by $n \in \mathbb{Z}^+$ tends to
ship us outside Z. We will discover that many of the words we associated with groups like
subgroup, group homomorphism, quotient group have ring analogs, and that will make the
ring theory a bit easier to learn.

Figure 5.1 comes from making an m x m matrix of values of $x^2 + y^2 \pmod{m}$ for $x, y \in \mathbb{Z}/m\mathbb{Z}$.
Then Mathematica does a ListDensityPlot of the matrix. There is a movie of
such things on my website letting m vary from 3 to 100 or so.
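
For readers without Mathematica, the matrix itself takes one line in most languages. Here is a sketch in Python (the function name is my own); the resulting table could then be handed to any density-plot routine, which is not shown here.

```python
def sum_of_squares_table(m):
    """m x m table whose (x, y) entry is x^2 + y^2 reduced mod m."""
    return [[(x * x + y * y) % m for y in range(m)] for x in range(m)]

table = sum_of_squares_table(5)
for row in table:
    print(row)
# [0, 1, 4, 4, 1]
# [1, 2, 0, 0, 2]
# [4, 0, 3, 3, 0]
# [4, 0, 3, 3, 0]
# [1, 2, 0, 0, 2]
```

The table is symmetric under exchanging x and y and under replacing x by m - x, which accounts for much of the symmetry visible in pictures of this kind.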


Figure 5.1 The color at point $(x, y) \in \mathbb{Z}_{163}^2$ indicates the value of
$x^2 + y^2 \pmod{163}$


Part II Rings 




A more complicated finite field picture is that of Figure 5.2. It is associated with 2 x 2
matrices with elements in the finite field $\mathbb{Z}_{11}$. We will explain it in Section 8.3. It should be
compared with Figure 2.7.


Figure 5.2 Points (x, y), for $x, y \in \mathbb{F}_{121}$, $y \neq 0$, have the same color if the points $z = x + y\sqrt{\delta}$ are equivalent
under the action of non-singular 2 x 2 matrices $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with entries in $\mathbb{Z}_{11}$. The action of g on
z is by fractional linear transformation $z \to (az + b)/(cz + d) = gz$. Here $\delta$ is a fixed non-square in
the field $\mathbb{F}_{121}$ with 121 elements


Exercise 5.1.1 Make a similar picture to Figure 5.1, replacing $x^2 + y^2 \pmod{m}$ with


ety 


(mod m) for some odd integer m. 


Exercise 5.1.2 The functions on $\mathbb{Z}_m$ given by $\chi_a(x) = e^{2\pi i (ax)/m}$ for x and $a \in \mathbb{Z}_m$ were
considered in Section 4.2. From these functions one can build up trigonometric functions
of two variables similar to those we graphed in Figure 2.5. Create a ListDensityPlot in


Mathematica for 


and compare the result with Figure 2.5. 


5.2 What is a Ring? 


Our favorite ring for error-correcting codes will be $\mathbb{Z}_2$ or $\mathbb{Z}_p$, where p is a prime. Other
favorites are the ring of integers Z, the field of real numbers R, the field of complex numbers
C, the field of rational numbers Q.




Definition 5.2.1 A ring R is an Abelian group under addition (denoted +) with a binary
operation of multiplication (denoted ·) which is associative,

$$a \cdot (b \cdot c) = (a \cdot b) \cdot c, \quad \text{for all } a, b, c \in R,$$

and satisfies left and right distributive laws:

$$a \cdot (b + c) = a \cdot b + a \cdot c \quad \text{and} \quad (a + b) \cdot c = a \cdot c + b \cdot c, \quad \text{for all } a, b, c \in R.$$

We are assuming that multiplication is a binary operation as in Definition 1.8.11.
Usually we will write $xy = x \cdot y$.


We will call the identity for addition 0. Multiplication in a ring need not be commutative. 
If it is, we say that the ring is a commutative ring. 

Also, the ring need not have an identity for multiplication (although some people do
require this; e.g., Artin [2]). If the ring does have such an identity, we say it is a ring with
(two-sided) identity for multiplication and we call this identity 1 so that $1 \cdot a = a \cdot 1 = a$,
$\forall a \in R$. Some people (e.g., Gallian [33]) call the identity for multiplication a “unity.”
The word unity seems too close to the word unit which means that the element has a
multiplicative inverse. See Definition 5.2.3.

If it exists, the identity for multiplication must be unique by the same argument that
worked for groups. Most people might want to assume that 1 and 0 are distinct as well.
Otherwise $\{0\}$ is a ring with identity for multiplication. That must be the silliest ring with
identity for multiplication. However, it looks like some people do call this a ring with
identity for multiplication. What can I say? The terminology is not set in stone yet. The
subject is still alive. However, I will normally assume that $1 \neq 0$.

The examples $\mathbb{Z}_n$, for $n \geq 2$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, are all commutative rings with our usual
operations of addition and multiplication.

It is possible to drop the requirement that multiplication be associative. We will not consider
non-associative rings here. See Exercise 5.2.14 for an example. Imagine the problems
if you have to keep the parentheses in your products because $(ab)c \neq a(bc)$.


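Example 1. A Non-commutative Ring. Consider the ring
\[ \mathbb{R}^{2 \times 2} = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \,\middle|\, a, b, c, d \in \mathbb{R} \right\}, \]
with addition defined by
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} = \begin{pmatrix} a + a' & b + b' \\ c + c' & d + d' \end{pmatrix} \tag{5.1} \]
and multiplication defined by
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} = \begin{pmatrix} aa' + bc' & ab' + bd' \\ ca' + dc' & cb' + dd' \end{pmatrix}. \tag{5.2} \]
This ring is not commutative but it does have an identity for multiplication. What is the
identity? ▲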


Exercise 5.2.1 Check the preceding statements about $\mathbb{R}^{2 \times 2}$.

Example 2: A Commutative Ring without an Identity for Multiplication. $2\mathbb{Z}$, the ring of even
integers. ▲


Exercise 5.2.2 Check that 2Z is a ring without identity for multiplication. 


Proposition 5.2.1 (Properties of Rings). Suppose that $R$ is a ring. Then, for all $a, b, c \in R$,
we have the following facts. Here $-a$ denotes the inverse of $a$ under addition and $a - b = a + (-b)$.

(1) $a \cdot 0 = 0 \cdot a = 0$, where 0 is the identity for addition in $R$.

(2) $a(-b) = (-a)b = -(ab)$.

(3) $(-a)(-b) = ab$.

(4) $a(b - c) = ab - ac$.

(5) If $R$ has an identity for multiplication (which is unique and which we call 1) then
$(-1)a = -a$, $(-1)(-1) = 1$.


Proof. First recall from Section 2.3 that, since R is a group under addition, the additive 
identity, 0, is unique as are additive inverses —a of elements a. 


(1) Using the fact that 0 is the identity for addition in $R$ as well as the distributive laws, we
have

$0 + a \cdot 0 = a \cdot 0 = a \cdot (0 + 0) = a \cdot 0 + a \cdot 0$.

Upon subtracting $a \cdot 0$ from both sides of the equation, we see that $0 = a \cdot 0$. You can
make a similar argument to see that $0 \cdot a = 0$.

(2) We have

$a(-b) + ab = a(-b + b) = a \cdot 0 = 0 \implies a(-b) = -(ab)$.


Which ring axioms are being used at each point? We leave it as an exercise to finish 
the proof of (2). 

(3) First note that —(—a) =a since a + (—a) =0. Then, by part (2) and the associative law 
for multiplication: 


(—a)(—b) = —(a(—b)) = —(—ab) = ab. 


(4) We have, using the distributive laws and part (2), 


a(b — c) =ab+ a(—c) =ab — ac. 


(5) We leave this part as an exercise. ▲

Exercise 5.2.3 Finish the proof of part (2) of the preceding proposition.

Exercise 5.2.4 Prove part (5) of the preceding proposition.


Next we will define a subring in an analogous way to the way we defined a subgroup 
in Section 2.4. You should be able to write the definition yourself without looking at what 
follows. Just do not forget to say the subring is non-empty. 


Definition 5.2.2 Suppose that R is a ring. If S is a non-empty subset of R which is a 
ring under the same operations as R, we call S a subring of R. 


Proposition 5.2.2 (Subring Test). A non-empty subset S of a ring R is a subring of R iff S 
is closed under subtraction and multiplication. 


Proof. $\Longleftarrow$ Assume $S$ is closed under subtraction and multiplication. The one-step subgroup
test from Proposition 2.4.3 implies that $S$ is a subgroup of $R$ under addition. Moreover $S$
must be Abelian under addition since $R$ is. Since $S$ is closed under multiplication, we are
done because the associative law for multiplication, plus the distributive laws, follow from
those in $R$.

$\Longrightarrow$ Suppose $S$ is a subring of $R$. Clearly $S$ must be closed under subtraction and
multiplication. ▲


Example 1. $\{0\}$ is a subring of any ring $R$.
To see this, apply the subring test. First note that $-0 = 0$ and thus $0 - 0 = 0 + 0 = 0$. Also
$0 \cdot 0 = 0$, by multiplication rule (1). ▲

Example 2. $S = \{0, 3, 6, 9 \pmod{12}\} = \{3x \mid x \in \mathbb{Z}_{12}\}$ is a subring of $\mathbb{Z}_{12}$.
To see this, use our subring test. Then $3x - 3y = 3(x - y)$ and $(3x)(3y) = 3(3xy)$ are both
in $S$. ▲
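
The subring test is easy to try by brute force in a finite ring. Here is a minimal Python sketch, assuming elements of $\mathbb{Z}_n$ are represented by the integers 0 to n - 1; the helper name `is_subring` is ours for illustration and does not come from the text.

```python
def is_subring(subset, n):
    """Subring test in Z_n: non-empty and closed under subtraction and multiplication."""
    s = {a % n for a in subset}
    if not s:
        return False
    return all((a - b) % n in s and (a * b) % n in s for a in s for b in s)


# S = {3x | x in Z_12}, as in Example 2
S = {(3 * x) % 12 for x in range(12)}
print(sorted(S))                  # [0, 3, 6, 9]
print(is_subring(S, 12))          # True
print(is_subring({0, 1, 3}, 12))  # False: 0 - 1 = 11 is not in the set
```

A brute-force check like this is no substitute for the two-line proof above, but it is a quick way to test a candidate subset before trying to prove anything.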


Example 3. $n\mathbb{Z}$ is a subring of $\mathbb{Z}$ for all $n \in \mathbb{Z}$.
To see this, use the subring test as in the preceding example. ▲


Example 4: The Gaussian Integers. $\mathbb{Z}[i] = \{a + bi \mid a, b \in \mathbb{Z}\}$ is a subring of $\mathbb{C}$. Here
$i = \sqrt{-1}$.

Again we use the subring test. To see that $\mathbb{Z}[i]$ is closed under subtraction and
multiplication, note that

$(a + bi) - (c + di) = (a - c) + (b - d)i \in \mathbb{Z}[i]$,
and $(a + bi)(c + di) = (ac - bd) + (ad + bc)i \in \mathbb{Z}[i]$,

since $a, b, c, d \in \mathbb{Z}$ implies $a - c$, $b - d$, $ac - bd$, $ad + bc \in \mathbb{Z}$. ▲

Example 5. The real numbers $\mathbb{R}$ form a subring of the complex numbers $\mathbb{C}$. ▲

Example 6. The ring $\mathbb{Z}$ is a subring of $\mathbb{Z}[\sqrt{-5}] = \{a + b\sqrt{-5} \mid a, b \in \mathbb{Z}\}$. ▲


Exercise 5.2.5 


(a) Show that $2\mathbb{Z} \cup 5\mathbb{Z}$ is not a subring of $\mathbb{Z}$.
(b) Show that $2\mathbb{Z} + 5\mathbb{Z} = \{2n + 5m \mid n, m \in \mathbb{Z}\} = \mathbb{Z}$.
(c) Show that $2\mathbb{Z} \cap 5\mathbb{Z} = 10\mathbb{Z}$.

Exercise 5.2.6 Consider the set
\[ R = \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{Z} \right\}. \]
Assume that addition is as in equation (5.1) above and multiplication is the usual matrix
multiplication as in (5.2). Prove or disprove: $R$ is a subring of the ring $\mathbb{Z}^{2 \times 2}$ of all
$2 \times 2$ matrices with integer entries under componentwise addition and the usual matrix
multiplication.


We have already seen examples of the following definition - when we considered the
unit groups $\mathbb{Z}_n^*$. As with this example, it is important that the inverse of the element of
the ring be located in the same ring.

Definition 5.2.3 Suppose $R$ is a ring with identity for multiplication (which we call
$1 \neq 0$). The units in $R$ are the invertible elements for multiplication in $R$. The set of
units is

$R^* = \{a \in R \mid \exists\, b \in R \text{ such that } ab = 1 = ba\}$.

If $ab = 1 = ba$, write $b = a^{-1}$.


Proposition 5.2.3 If R is a ring with identity for multiplication, the set of units R* forms a 
group under multiplication. 


Proof. We need to check four things. 


(1) R* is closed under multiplication. 

(2) The associative law holds for multiplication. 

(3) R* has an identity for multiplication. 

(4) If $a \in R^*$, then $a^{-1} \in R^*$; that is, $R^*$ is closed under inverse.

To prove (2), you just need to recall that the associative law holds in $R$.

To prove (3), just note that $1 \cdot 1 = 1$.

To prove (4), let $a \in R^*$. Then there is an element $a^{-1}$ in $R$ such that $aa^{-1} = a^{-1}a = 1$. But
then $a = (a^{-1})^{-1}$ and thus $a^{-1} \in R^*$.

To prove (1), suppose $a, b \in R^*$. Then we have $a^{-1}$ and $b^{-1}$ in $R$ and so - making use of the
associativity of multiplication:

$(ab)\, b^{-1}a^{-1} = a b b^{-1} a^{-1} = 1$.

Similarly $b^{-1}a^{-1}(ab) = 1$. It follows that $ab \in R^*$ with inverse $b^{-1}a^{-1}$. ▲

One moral of the preceding proof is that in a non-commutative ring, $(ab)^{-1} = b^{-1}a^{-1}$.
We knew this already from Exercise 2.1.8.


Example 1. $\mathbb{Z}^* = \{1, -1\}$.
To see this, just note that if $n$ and $\frac{1}{n}$ are both in $\mathbb{Z}$, then $n$ must be 1 or $-1$. Otherwise,
$|n| > 1$ and $0 < \frac{1}{|n|} < 1$. This contradicts Exercise 1.3.13. ▲


Example 2. $\mathbb{Z}_n^* = \{a \pmod{n} \mid \gcd(a, n) = 1\}$.
See Section 2.3 for the proof. ▲
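
As a sanity check on this description of the unit group, the following Python sketch lists the units of $\mathbb{Z}_{12}$ in two ways: by the gcd condition, and by searching for an inverse directly. The function names are ours and are only meant as an illustration.

```python
from math import gcd


def units_by_gcd(n):
    """Elements a of Z_n with gcd(a, n) = 1."""
    return [a for a in range(n) if gcd(a, n) == 1]


def units_by_search(n):
    """Elements a of Z_n for which some b in Z_n satisfies a*b = 1 (mod n)."""
    return [a for a in range(n) if any((a * b) % n == 1 for b in range(n))]


U = units_by_gcd(12)
print(U)                           # [1, 5, 7, 11]
print(U == units_by_search(12))    # True
# Closure under multiplication, as in part (1) of the proof of Proposition 5.2.3
print(all((a * b) % 12 in U for a in U for b in U))  # True
```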


Example 3. $\mathbb{Z}[x]$ is the ring of polynomials in one indeterminate $x$ with integer coefficients. ▲


The elements of $\mathbb{Z}[x]$ have the form $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, where $a_i \in \mathbb{Z}$.
If $a_n \neq 0$, we say that the degree of $f$ is $n = \deg f$. The zero polynomial is not usually said to
have a degree (unless you want to say it has degree $-\infty$). We call $x$ an “indeterminate” and
not a “variable” because we must distinguish between polynomials and functions when we
replace $\mathbb{Z}$ with $\mathbb{Z}_n$. We will say more about this later.

To add two of these polynomials, if the degree of $f$ is $n$ and the degree of $g$ is $m \leq n$,
put in some extra terms for $g$ with coefficients that are 0, if necessary. Then you just add
coefficients of like powers of $x$, that is:

\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
\[ g(x) = b_n x^n + b_{n-1} x^{n-1} + \cdots + b_1 x + b_0 \]

give
\[ f(x) + g(x) = (a_n + b_n) x^n + (a_{n-1} + b_{n-1}) x^{n-1} + \cdots + (a_1 + b_1) x + (a_0 + b_0). \tag{5.3} \]
Multiplication is more complicated but you have known how to do this since high school. 


We know that we want the operation to be associative and distributive. So suppose we are 
given polynomials 


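\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
\[ g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0. \]

The product of $f(x)$ and $g(x)$ is

\begin{align*}
f(x)g(x) &= (a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0)(b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0) \\
&= a_n b_m x^{n+m} + a_n b_{m-1} x^{n+m-1} + \cdots + a_n b_1 x^{n+1} + a_n b_0 x^n \\
&\quad + (a_{n-1} x^{n-1} + \cdots + a_1 x + a_0)(b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0) \\
&= a_n b_m x^{n+m} + (a_n b_{m-1} + a_{n-1} b_m) x^{n+m-1} + \cdots \\
&\quad + \Big( \sum_{i+j=k} a_i b_j \Big) x^k + \cdots + (a_1 b_0 + a_0 b_1) x + a_0 b_0.
\end{align*}

Thus the product of the polynomials $f(x)$ and $g(x)$ is

\[ f(x)g(x) = \sum_{k=0}^{m+n} \Big( \sum_{i+j=k} a_i b_j \Big) x^k. \tag{5.4} \]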


The sum and product are still in Z[x]. Checking the other ring properties is a bit tedious. 
The zero polynomial has all its coefficients equal to 0. The additive inverse of f(x) has as 
its coefficients the negatives of the corresponding coefficients of f(x). The multiplicative 
identity is the degree 0 (constant) polynomial $f(x) = 1$. Checking the associative law for
multiplication is the worst.

Assuming that no polynomial in the formula below is the zero polynomial, we have

$\deg(fg) = \deg f + \deg g$. (5.5)
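
Formulas (5.3), (5.4), and (5.5) translate directly into code. The sketch below is one possible Python rendering, under the assumption (ours, not the text's) that a polynomial is stored as a non-empty list of integer coefficients with the constant term first and a nonzero last entry.

```python
def poly_add(f, g):
    """Formula (5.3): pad the shorter list with zeros, then add like powers."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]


def poly_mul(f, g):
    """Formula (5.4): the coefficient of x^k is the sum of a_i * b_j over i + j = k."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product


f = [1, 2]      # 1 + 2x
g = [3, 0, 1]   # 3 + x^2
print(poly_add(f, g))   # [4, 2, 1], that is 4 + 2x + x^2
print(poly_mul(f, g))   # [3, 6, 1, 2], that is 3 + 6x + x^2 + 2x^3
# Formula (5.5): degrees add, since the leading coefficients are nonzero integers
print(len(poly_mul(f, g)) - 1 == (len(f) - 1) + (len(g) - 1))  # True
```

The double loop in `poly_mul` is just the inner sum over $i + j = k$ from (5.4), organized by the pair $(i, j)$ rather than by $k$.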


Definition 5.2.4 Suppose that $f(x)$ is a polynomial with coefficients in some ring $R$. A
root $\theta$ of $f$ is an element of a possibly larger ring than $R$ such that $\theta$ is a solution of the
equation $f(\theta) = 0$.

Thus, for example, $\sqrt{2} \in \mathbb{R}$ is a root of $x^2 - 2 = 0$ and $i \in \mathbb{C}$ is a root of $x^2 + 1 = 0$.


Exercise 5.2.7 Complete the proof that the polynomial ring $\mathbb{Z}[x]$ is a commutative ring with
identity for multiplication. Do your arguments work if you replace $\mathbb{Z}$ by any commutative
ring $R$ with identity for multiplication?

Exercise 5.2.8 What is the analog of formula (5.5) for $\deg(f + g)$?

Hint. Consider an inequality rather than an equality.

Question. What is the group of units in the polynomial ring $\mathbb{Z}[x]$?

Given $f(x) \in \mathbb{Z}[x]$, suppose $fg = 1$, for some $g(x) \in \mathbb{Z}[x]$. This implies by formula (5.5) that
$\deg f + \deg g = 0$. The only way that can happen is if $\deg f = \deg g = 0$. Thus the units of
$\mathbb{Z}[x]$ are the nonzero constant polynomials that are units in $\mathbb{Z}$ itself, which implies

$(\mathbb{Z}[x])^* = \mathbb{Z}^* = \{1, -1\}$.


Of course, if $\deg f > 0$, you can still consider $1/f(x)$ but you get an infinite series
instead of a polynomial. For example, the geometric series is

\[ \frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n. \]

Moreover, this is only a convergent series if $|x| < 1$. But algebra is not supposed to deal
with convergence and limits. Instead an algebraist would view this as a “formal power
series” with coefficients in some ring $R$: as an element of $R[[x]]$, whose elements look like
$\sum_{n=0}^{\infty} a_n x^n$, with $a_n \in R$. This is in line with our insistence on viewing $x$ in a polynomial $f(x)$
as an indeterminate rather than a function.


Exercise 5.2.9 Find the group of units in $\mathbb{R}[x]$, the ring of polynomials with real coefficients,
where the formulas for addition and multiplication are the same as those for $\mathbb{Z}[x]$, namely,
formulas (5.3) and (5.4).

Exercise 5.2.10 Check that the set $C(\mathbb{R})$ consisting of all continuous real-valued functions
on the real line forms a commutative ring if you define $(f + g)(x) = f(x) + g(x)$ for all $x \in \mathbb{R}$,
and $(fg)(x) = f(x)g(x)$, $\forall x \in \mathbb{R}$. Here we assume that $f, g$ are in $C(\mathbb{R})$. Does this ring have an
identity for multiplication?


Exercise 5.2.11 Find the units in the ring $\mathbb{Z}^{2 \times 2}$ of $2 \times 2$ matrices with integer entries and
the usual matrix operations.


Note that, for a ring $R$, we distinguish polynomials in $R[x]$ from functions $f : R \to R$. This is
important when $R$ is finite like $\mathbb{Z}_n$, for there are only finitely many functions but infinitely
many polynomials.
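
A small example makes the distinction concrete. In the Python sketch below (our illustration, with coefficient lists stored constant term first), the polynomials $x$ and $x^3$ are different elements of $\mathbb{Z}_3[x]$, yet they give the same function from $\mathbb{Z}_3$ to $\mathbb{Z}_3$.

```python
def evaluate(coeffs, x, p):
    """Evaluate a polynomial (coefficients listed constant term first) at x in Z_p."""
    result = 0
    for c in reversed(coeffs):   # Horner's rule
        result = (result * x + c) % p
    return result


p = 3
f = [0, 1]         # the polynomial x
g = [0, 0, 0, 1]   # the polynomial x^3

values_f = [evaluate(f, x, p) for x in range(p)]
values_g = [evaluate(g, x, p) for x in range(p)]
print(values_f, values_g)   # [0, 1, 2] [0, 1, 2]
print(f != g)               # True: different coefficient lists, so different polynomials
```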


Exercise 5.2.12 How many functions are there mapping $\mathbb{Z}_n$ into $\mathbb{Z}_n$?


Exercise 5.2.13 Given two indeterminates, $x$ and $y$, we can create the ring $R = (\mathbb{Z}[x])[y] = \mathbb{Z}[x, y]$.
Define a monomial to be an element of $R$ of the form $f(x, y) = c x^m y^n$, with $c \in \mathbb{Z}$.
Define the degree of $f$ to be $m + n$, assuming that $c \neq 0$. Then any polynomial in $R$ is a sum
of monomials. Define the degree of an arbitrary polynomial $p(x, y)$ in $R$ to be the maximum
degree of the monomials $c x^m y^n$ in $p$ with $c \neq 0$. Show that $\deg(pq) = \deg p + \deg q$, for
$p, q \in R$. What are the units in $R$?


Exercise 5.2.14 (Example of a Non-associative Ring). Consider the ring $M = \mathbb{R}^{n \times n}$ consisting
of all $n \times n$ matrices with real entries. Addition is the usual componentwise addition
of matrices. Define multiplication to be given by the Lie bracket $[A, B] = AB - BA$, for
$A, B \in \mathbb{R}^{n \times n}$. Show that this multiplication is not associative but instead satisfies the Jacobi
identity:

$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$, for $A, B, C \in \mathbb{R}^{n \times n}$.

The ring in the last exercise is the Lie algebra of the Lie group $GL(n, \mathbb{R})$ consisting of
non-singular $n \times n$ real matrices with the operation of matrix multiplication. The dictionary
relating Lie groups and Lie algebras is perhaps the most important tool in the study of Lie
groups. See Terras [119].


Exercise 5.2.15 Consider the ring $F[x_1, \ldots, x_n]$ of polynomials in $n$ indeterminates $x_1, \ldots, x_n$
over a field $F$. Imitate Exercise 5.2.13 for this ring.

Exercise 5.2.16 Consider the ring of integer coefficient formal power series $\mathbb{Z}[[x]]$ in the
indeterminate $x$, where we define the sum and product as follows for $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} b_n x^n$
with $a_n, b_n \in \mathbb{Z}$:

\[ \sum_{n=0}^{\infty} a_n x^n + \sum_{n=0}^{\infty} b_n x^n = \sum_{n=0}^{\infty} (a_n + b_n) x^n, \]

\[ \Big( \sum_{m=0}^{\infty} a_m x^m \Big) \Big( \sum_{n=0}^{\infty} b_n x^n \Big) = \sum_{k=0}^{\infty} \Big( \sum_{\substack{m+n=k \\ 0 \leq m, n \leq k}} a_m b_n \Big) x^k. \]

Show that $\mathbb{Z}[[x]]$ is a commutative ring $R$ with identity for multiplication such that $ab = 0$
implies either $a = 0$ or $b = 0$, for $a, b$ in $R$.

5.3 Integral Domains and Fields are Nicer Rings


In this section we want to consider rings that are more like the ring of integers Z or the 
ring of rational numbers Q. 


Definition 5.3.1 If $R$ is a commutative ring, we say $a \neq 0$ in $R$ is a zero divisor if $ab = 0$
for some $b \in R$ such that $b \neq 0$.

Example. In $R = \mathbb{Z}_6$, both 2 and 3 (mod 6) are zero divisors since $2 \cdot 3 \equiv 0 \pmod{6}$. However,
$R = \mathbb{Z}_5$ has no zero divisors. ▲

Definition 5.3.2 If $R$ is a commutative ring with identity for multiplication which has
no zero divisors, we say that $R$ is an integral domain. As usual we assume $1 \neq 0$.

One might be forgiven for thinking that zero divisors are “bad” and thus integral domains
are “good.” Of course we have also been thinking $\mathbb{Z}_6$ is pretty nice and it is clearly not an
integral domain. Perhaps we should just think that $\mathbb{Z}_5$ is way nicer than $\mathbb{Z}_6$.


Example 1. $\mathbb{Z}$ is an integral domain as are $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Q}$. We know this for $\mathbb{Z}$ as it was in our
axioms of Section 1.3. We will not discuss the axioms for $\mathbb{R}$ and $\mathbb{C}$. Yes, it involves limits and
that is calculus. See Cohen and Ehrlich [17] or Lang [63] or most advanced calculus texts
for more information. As to the rationals, we assume you know how to deal with fractions
and discuss $\mathbb{Q}$ more carefully in Section 6.4. If it seems bad to wait, feel free to jump
ahead. ▲


Example 2. $\mathbb{Z}_n$ is not an integral domain if $n$ is not a prime.
To see this, note that if $n$ is not prime, then $n = ab$, where $0 < a, b < n$. But then neither
$a$ nor $b$ can be congruent to 0 mod $n$ and thus $a$ and $b$ are both zero divisors. ▲

Example 3. $\mathbb{Z}_p$ is an integral domain if $p$ is a prime.

To see this, note that $ab \equiv 0 \pmod{p} \iff p$ divides $ab$. Then, by Euclid’s lemma
(see Lemma 1.5.1), this means $p$ divides either $a$ or $b$. So either $a$ or $b$ is congruent
to 0 (mod $p$). ▲

Putting the two preceding examples together, we see that $\mathbb{Z}_n$ is an integral domain iff $n$
is a prime.
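
The two examples can be checked by machine for small $n$. The following Python sketch (helper names are ours) lists the zero divisors of $\mathbb{Z}_n$ by brute force and confirms, for $2 \leq n < 50$, that there are none exactly when $n$ is prime.

```python
def zero_divisors(n):
    """Nonzero a in Z_n with a*b = 0 (mod n) for some nonzero b in Z_n."""
    return [a for a in range(1, n) if any((a * b) % n == 0 for b in range(1, n))]


def is_prime(n):
    """Trial division; adequate for the small n used here."""
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


print(zero_divisors(6))   # [2, 3, 4]
print(zero_divisors(5))   # []
print(all((zero_divisors(n) == []) == is_prime(n) for n in range(2, 50)))  # True
```

Note that 4 shows up as a zero divisor in $\mathbb{Z}_6$ as well, since $4 \cdot 3 \equiv 0 \pmod{6}$.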

The following lemma shows that cancellation is legal in an integral domain $R$ even though
inverses of nonzero elements may not exist in $R$.

Lemma 5.3.1 (Cancellation Law in an Integral Domain). Suppose that $R$ is an integral
domain. If $a, b, c \in R$, $a \neq 0$, and $ab = ac$, then $b = c$.

Proof. Since $ab = ac$, we see that $0 = ab - ac = a(b - c)$. Since $a \neq 0$ and $R$ has no zero
divisors, it follows that $b - c = 0$. Thus $b = c$. ▲


Exercise 5.3.1 Consider the ring $R$ of matrices of the form $\begin{pmatrix} 0 & x \\ 0 & 0 \end{pmatrix}$, for $x \in \mathbb{Z}$, with the usual
componentwise addition and matrix multiplication. Show that $R$ is indeed a ring. Is it an
integral domain?


Integral domains $R$ are nice, but maybe not nice enough. Suppose we want to know that
$a^{-1} \in R$ for any $a \in R - \{0\}$. Then we want a field. Of course, you can construct a field out
of an integral domain by imitating the construction of $\mathbb{Q}$ out of $\mathbb{Z}$, but that is another story
to be told in Section 6.4.

Definition 5.3.3 A field $F$ is a commutative ring with identity for multiplication such
that any nonzero element $a \in F$ has a multiplicative inverse $a^{-1} \in F$.

It follows from this definition that if $F$ is a field, then the multiplicative group of units
$F^* = F - \{0\}$, which is as big as the unit group could be. Yes, there is no way it is ever legal
to divide by 0.


Proposition 5.3.1 (Some Facts about Fields). 


(1) Any field F is an integral domain. 
(2) Any finite integral domain D is a field. 


Proof. 


(1) If $a, b \in F$ such that $ab = 0$ and $a \neq 0$, then $b = a^{-1}ab = 0$. So $F$ has no zero divisors.

(2) We just need to show that $D^* = D - \{0\}$ is closed under multiplication and multiplicative
inverse. It is closed under multiplication because $D$ has no zero divisors. Finiteness
will force it to be closed under multiplicative inverse by the same argument that proved
the finite subgroup test (Proposition 2.4.4). ▲

Exercise 5.3.2 Show that the argument that proved the finite subgroup test (Proposition 2.4.4)
finishes the proof of Proposition 5.3.1.


Example 1. $\mathbb{Z}$ is not a field as the only units (i.e., invertible elements for multiplication) in
$\mathbb{Z}$ are 1 and $-1$. ▲

Example 2. $\mathbb{Z}_p$ is a field iff $p$ is prime. Why? Recall Example 3 above and the preceding
proposition. ▲
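
To see inverses appear in a finite integral domain, one can look at the map $x \mapsto ax$ on the nonzero elements. With no zero divisors this map is one-to-one, so on a finite set it must hit 1. This is a different route from the finite subgroup test argument of Exercise 5.3.2, and the Python sketch below (function name ours) is only an illustration of it for $\mathbb{Z}_7$ and $\mathbb{Z}_6$.

```python
def inverse_by_pigeonhole(a, n):
    """Return b in Z_n with a*b = 1 (mod n), or None if multiplication by a misses 1."""
    images = [(a * x) % n for x in range(1, n)]   # images of the nonzero elements
    return images.index(1) + 1 if 1 in images else None


print({a: inverse_by_pigeonhole(a, 7) for a in range(1, 7)})
# {1: 1, 2: 4, 3: 5, 4: 2, 5: 3, 6: 6}
print(inverse_by_pigeonhole(2, 6))   # None: 2 is a zero divisor in Z_6
```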


Exercise 5.3.3 State whether the following rings are integral domains. Then say whether 

they are fields. Give reasons for your answers. 

(a) The set R[x] of all polynomials in one indeterminate with real coefficients and with 
addition and multiplication defined in (5.3) and (5.4) in Section 5.2; 

(b) $C(\mathbb{R}) = \{f : \mathbb{R} \to \mathbb{R} \mid f \text{ continuous}\}$ with pointwise addition and multiplication of
$f, g \in C(\mathbb{R})$ defined by $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x)g(x)$, for all $x \in \mathbb{R}$.

Example 3. The set of rational numbers $\mathbb{Q} = \{\frac{a}{b} \mid a, b \in \mathbb{Z}, b \neq 0\}$ is a field. See
Section 6.4. ▲

Example 4. The set of real numbers $\mathbb{R}$ is also a field - as is the set of complex numbers
$\mathbb{C}$. Again we leave this to the advanced calculus texts for $\mathbb{R}$. However, if you know that
$\mathbb{R}$ is a field we make it an exercise to see that $\mathbb{C}$ is a field. If you think of $\mathbb{R}$ as the set of
infinite decimals, you should be able to get pretty close to proving $\mathbb{R}$ is a field. You can
add, subtract, multiply and divide them presumably. I prefer to think of real numbers as
limits of Cauchy sequences of rationals myself. ▲

So we could view the finite field $\mathbb{Z}_p$, for $p$ prime, as an analog of the field $\mathbb{R}$ of real
numbers. But the picture of $\mathbb{R}$ is a continuous line without holes, while our picture of $\mathbb{Z}_p$ is
a finite circle of points. If $p$ is large, a penguin sitting on a finite circle might think it was
a continuous line - just as the earth looks flat to the creatures living on it. It is certainly
possible to create a finite analog of a sphere as well by looking at finite analogs of the
rotation group and viewing the sphere as a quotient of $O(3, \mathbb{R})/O(2, \mathbb{R})$, where we view
$O(2, \mathbb{R})$ as the rotations fixing the north pole on the sphere. Replace $\mathbb{R}$ by $\mathbb{Z}_p$ to get a finite
analog.


Exercise 5.3.4 Assuming $\mathbb{R}$ is a field, prove that $\mathbb{C}$ is a field.

Question. Are there other finite fields besides $\mathbb{Z}_p$ for prime $p$?

Answer. Yes, for example, you can imitate the construction that gives the complex numbers
$\mathbb{C}$. First note that $-1 \neq a^2$ for all $a \in \mathbb{Z}_3$. Thus we can consider $i$ to be some creature outside
$\mathbb{Z}_3$ such that $i^2 = -1$. Of course, we are not thinking that $i \in \mathbb{C}$. So if you want to be careful
replace $i$ by some letter with no particular meaning. I will usually use $\theta$. But - reader
beware - we cannot replace 3 by 5 in this construction since $2^2 \equiv -1 \pmod{5}$.

Anyway, we can define

$\mathbb{F}_9 = \mathbb{Z}_3[i] = \{a + bi \mid a, b \in \mathbb{Z}_3\}$, where $i^2 = -1$.

The order of $\mathbb{F}_9$ is 9 since there are three choices of $a$ and three choices of $b$ in $a + ib$.
You add and multiply in $\mathbb{F}_9$ just as you would in the complex numbers, except that every
computation is modulo 3. Thus for $a, b, c, d \in \mathbb{Z}_3$, we define

$(a + bi) + (c + di) = (a + c) + (b + d)i$ and
$(a + bi) \cdot (c + di) = (ac - bd) + (ad + bc)i$.

Why does this make $\mathbb{Z}_3[i]$ a field? You get a ring by the same arguments that work to
see the complex numbers form a ring. To see that $\mathbb{F}_9$ is a field, we need to see that if $a + ib$
is a nonzero element of $\mathbb{F}_9$, then it has a multiplicative inverse in $\mathbb{F}_9$. Again use the same
argument that works for $\mathbb{C}$. That is,

\[ \frac{1}{a + ib} = \frac{1}{a + ib} \cdot \frac{a - ib}{a - ib} = \frac{a - ib}{a^2 + b^2} = \frac{a}{a^2 + b^2} + i\, \frac{-b}{a^2 + b^2}. \]

Why is $a^2 + b^2 \neq 0$? This is a little harder to prove than it would be if $a, b \in \mathbb{R}$. We know
that $a + ib \neq 0 \implies$ either $a$ or $b$ is not 0 in $\mathbb{Z}_3$. Suppose $a$ is not 0 in $\mathbb{Z}_3$. Thus $a \equiv 1$ or
$-1 \pmod{3}$. So $a^2 \equiv 1 \pmod{3}$. Then $b^2 \equiv 0$ or $1 \pmod{3}$. It follows that $a^2 + b^2 \equiv 1$ or
$2 \pmod{3}$. Thus $a^2 + b^2$ is not 0 (mod 3). Since $\mathbb{Z}_3$ is a field, we know that $1/(a^2 + b^2) \in \mathbb{Z}_3$.
The same argument works if $b$ is not 0 in $\mathbb{Z}_3$.
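
The nine-element field is small enough to explore by brute force. In the Python sketch below, which is our illustration rather than anything from the text, the element $a + bi$ is stored as the pair `(a, b)`, and the modulus `p` is a parameter so that the failure for $p = 5$ can be seen alongside the success for $p = 3$.

```python
from itertools import product


def add(z, w, p):
    """(a + bi) + (c + di) with coefficients reduced mod p."""
    return ((z[0] + w[0]) % p, (z[1] + w[1]) % p)


def mul(z, w, p):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i with coefficients reduced mod p."""
    (a, b), (c, d) = z, w
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def inverses(p):
    """Map each nonzero a + bi to its multiplicative inverse, or to None if it has none."""
    elements = list(product(range(p), repeat=2))
    table = {}
    for z in elements:
        if z != (0, 0):
            table[z] = next((w for w in elements if mul(z, w, p) == (1, 0)), None)
    return table


inv9 = inverses(3)
print(len(inv9), all(w is not None for w in inv9.values()))   # 8 True
print(inv9[(1, 1)])                    # (2, 1), that is 1/(1 + i) = 2 + i
z = (1, 2)
print(add(add(z, z, 3), z, 3))         # (0, 0), that is 3z = 0
print(mul((1, 2), (1, 3), 5))          # (0, 0): (1 + 2i)(1 - 2i) = 5 = 0 when p = 5
```

The last line shows concretely why 3 cannot be replaced by 5: since $2^2 \equiv -1 \pmod{5}$, the same recipe produces zero divisors.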


Note that, for any element $z \in \mathbb{F}_9$, $3z = z + z + z = 0$. This happens because $z = x + iy$,
with $x, y \in \mathbb{Z}_3$ and $3z = 3x + i\,3y = 0$. We say that $\mathbb{F}_9$ has characteristic 3.


Definition 5.3.4 The characteristic of a ring $R$ is the smallest $n \in \mathbb{Z}^+$ (assuming such an
$n$ exists) such that

\[ nx = \underbrace{x + x + \cdots + x}_{n \text{ times}} = 0, \quad \text{for all } x \in R. \]

If no such $n$ exists, we say that $R$ has characteristic 0.

Some authors (e.g., Birkhoff and Maclane [9]) do not say characteristic 0 but instead say
characteristic $\infty$.


Lemma 5.3.2 Suppose that R is a ring with identity 1 for multiplication. Then we have the 
following facts. 


(1) If the additive order of 1 is not finite, then the characteristic of R is 0. 
(2) If the additive order of 1 is n, then the characteristic of R is n. 


Proof. 


(1) We leave this as an exercise. 
(2) First note that the characteristic must be divisible by $n$. Why? On the other hand, if
$\underbrace{1 + \cdots + 1}_{n \text{ times}} = 0$, then $\forall r \in R$

\[ (\underbrace{1 + \cdots + 1}_{n \text{ times}})\, r = \underbrace{r + \cdots + r}_{n \text{ times}} = 0. \]

This means that the characteristic is $\leq n$. It follows that the characteristic must equal $n$.
▲
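
For the rings $\mathbb{Z}_n$ the lemma can be confirmed by direct computation. The Python sketch below (function names ours) computes the additive order of 1 and, separately, the characteristic straight from Definition 5.3.4, and compares the two.

```python
def additive_order_of_one(n):
    """Number of times 1 must be added to itself to reach 0 in Z_n."""
    total, count = 1 % n, 1
    while total != 0:
        total = (total + 1) % n
        count += 1
    return count


def characteristic(n):
    """Smallest k >= 1 with k*x = 0 for every x in Z_n, as in Definition 5.3.4."""
    return next(k for k in range(1, n + 1) if all((k * x) % n == 0 for x in range(n)))


print(additive_order_of_one(12), characteristic(12))   # 12 12
print(all(additive_order_of_one(n) == characteristic(n) for n in range(2, 40)))  # True
```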


Exercise 5.3.5 Prove part (1) of the preceding lemma. Then answer the “Why?” in the proof 
of part (2) of that lemma. 


Example 1. $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ all have characteristic 0. To see this, by the preceding lemma, you
just need to note that no finite sum of 1s can equal 0 in these rings. ▲

Example 2. $\mathbb{Z}_p$ has characteristic $p$ by the preceding lemma since that is the additive order
of 1 in $\mathbb{Z}_p$. ▲

Example 3. $\mathbb{F}_9 = \mathbb{Z}_3 + i\mathbb{Z}_3$ has characteristic 3 since 3 is the additive order of 1 - again
using the preceding lemma. $\mathbb{F}_9$ has nine elements. Thus the order of a finite field need not
equal its characteristic. ▲

Example 4. $\mathbb{Z}_p[x]$, the ring of polynomials in one indeterminate with coefficients in $\mathbb{Z}_p$, has
characteristic $p$. $\mathbb{Z}_p[x]$ has infinitely many elements. ▲

Exercise 5.3.6 Prove the preceding statements about Examples 1-4.


Recalling the definition of isomorphic groups, you should feel no surprise upon reading 
the following definition. 


Definition 5.3.5 Two rings $R$, $S$ are isomorphic if there is a 1-1 and onto function (ring
isomorphism) $f : R \to S$ preserving both ring operations. If rings $R$ and $S$ are isomorphic,
we will write $R \cong S$.


For most purposes we can identify isomorphic rings, just as we identified isomorphic 
groups earlier. In 1964 D. Singmaster asked how many rings of order 4 (up to ring isomor- 
phism) are there? The solution was given by D. M. Bloom (11 rings of order 4, of which 
three have a multiplicative identity [108]). See Wikipedia, and the website on small rings 
from an abstract algebra class of Gregory Dresden [27]. Of course one needs to define direct 
sum of rings and quotient rings before trying to answer such questions. You may be able 
to guess what these things are from your knowledge of direct sum of groups and quotient 
groups, and group isomorphism. So, for example, Benjamin Fine [31] has shown that there
are 11 non-isomorphic rings of order $p^2$ for any prime $p$.

There are two non-isomorphic rings of order $p$, if $p$ is a prime. The one we know well is
$\mathbb{Z}_p$. The other one has the same additive group but every product is defined to be 0. We call
that ring $\mathbb{Z}_p(0)$. It seems like a pretty silly ring to me. I would ban it from this book if I had
the power. You might defend the ring $\mathbb{Z}_p(0)$ by saying that it is isomorphic to the ring of
matrices:

\[ R = \left\{ \begin{pmatrix} 0 & a \\ 0 & 0 \end{pmatrix} \,\middle|\, a \in \mathbb{Z}_p \right\}, \tag{5.6} \]

where the ring operations are matrix addition and multiplication. We considered a similar
ring in Exercise 5.3.1.

Exercise 5.3.7 Prove that $\mathbb{Z}_p(0)$ and $R$ defined by (5.6) are isomorphic rings.


Instead of considering the rings of order $p^2$, for prime $p$, we note that there are four
non-isomorphic rings of order $pq$, where $p$ and $q$ are distinct primes (see Fine [31]), but we will
not give the proof here. These rings are $\mathbb{Z}_{pq}$, $\mathbb{Z}_{pq}(0)$, $\mathbb{Z}_p(0) \oplus \mathbb{Z}_q$, $\mathbb{Z}_p \oplus \mathbb{Z}_q(0)$. Here direct
sum of rings $R \oplus S$, for rings $R$, $S$, is defined as for groups - with the sum and product
defined componentwise:

$(a, b) + (c, d) = (a + c, b + d)$ and $(a, b)(c, d) = (ac, bd)$, if $a, c \in R$, $b, d \in S$.

We will give a detailed discussion of direct sums of rings in Section 5.4.

Exercise 5.3.8 Find the characteristics of the four non-isomorphic rings $\mathbb{Z}_{pq}$, $\mathbb{Z}_{pq}(0)$,
$\mathbb{Z}_p(0) \oplus \mathbb{Z}_q$, $\mathbb{Z}_p \oplus \mathbb{Z}_q(0)$ of order $pq$, where $p$ and $q$ are distinct primes.


Wikipedia lists the numbers of non-isomorphic rings of various small orders, which 
it obtains from the On-Line Encyclopedia of Integer Sequences (sequence listed under 
A027623). However, as in the case of order $p$, some of the rings would no doubt seem silly
to me. Somehow I do not have the same feeling about any of the small groups. Of course
every Abelian group gives rise to a ring all of whose products are defined to be 0. Yes, I
find these rings to be somewhat silly.


Lemma 5.3.3 Suppose that R is an integral domain. Then the characteristic of R is either a 
prime or 0. 


Proof. If the additive order of 1 is not finite, the characteristic of $R$ is 0, by the preceding
lemma.

Suppose the additive order of 1 is $n \in \mathbb{Z}^+$. Note that $n > 1$ since $1 \neq 0$. We must show that
$n$ is prime. We do a proof by contradiction.

Otherwise $n = ab$ for some integers $a, b$ with $0 < a, b < n$. This means
$0 = n \cdot 1 = (a \cdot b) \cdot 1 = (a \cdot 1) \cdot (b \cdot 1)$. Since $R$ has no zero divisors, it follows that either $a \cdot 1 = 0$ or $b \cdot 1 = 0$.
But this contradicts the minimality of $n = ab$. Therefore $n$ is prime. ▲


Question. Which of the following five ring examples are fields - using the standard ring
operations?

$\mathbb{Z}$; $\mathbb{Z}[i] = \{a + bi \mid a, b \in \mathbb{Z}\}$, where $i \in \mathbb{C}$, $i^2 = -1$;
$\mathbb{Z}[x]$ = polynomials with integer coefficients in one indeterminate $x$;
$\mathbb{Z}[\sqrt{-5}]$; $\mathbb{Z}_p$, for prime $p$.

Answer. Only the last example $\mathbb{Z}_p$, for prime $p$, is a field. In all other cases $\frac{1}{2}$ is not in the
ring, even though 2 is.


We define subfield just as we defined subgroup or subring. 


Definition 5.3.6 A subset $F$ of a field $E$ is a subfield if it is a field under the operations
of $E$. We also say that $E$ is an extension field of $F$.

Proposition 5.3.2 (Subfield Test). Suppose that $E$ is a field. Then a non-empty subset $F$ of
$E$ is a subfield of $E$ iff $\forall a, b \in F$ with $b \neq 0$, we have $a - b$ and $ab^{-1} \in F$.

Proof. Just use the one-step subgroup test (from Proposition 2.4.3) on $F$ to see that it
is an additive subgroup of $E$ and then use the same test again to see that $F - \{0\}$
is a multiplicative subgroup of $E - \{0\}$. The commutative and distributive laws are
automatic. ▲

The following lemma gives an equation in characteristic $p \neq 0$ which some calculus students
seem to believe is true in the real numbers. But that would mean most of the terms
in the binomial theorem (see Exercise 1.8.12) somehow vanish miraculously.


Lemma 5.3.4 Suppose that $R$ is an integral domain of nonzero characteristic $p$, which is
necessarily a prime. Then $\forall x, y \in R$, we have $(x + y)^p = x^p + y^p$.



Part II Rings 


Proof. By the binomial theorem (whose proof works in any integral domain as shown in
Exercise 1.8.12), we have

\[ (x + y)^p = \sum_{k=0}^{p} \binom{p}{k} x^k y^{p-k}. \]

Here we interpret the terms in the sum as products of positive integers with elements of
$R$ as in Definition 5.3.4. To finish this proof, we must show that the prime $p$ divides $\binom{p}{k}$
if $k = 1, 2, \ldots, p - 1$. This follows from the fact that the binomial coefficient is an integer
which is represented by the fraction:

\[ \binom{p}{k} = \frac{p(p-1) \cdots (p-k+1)}{k(k-1) \cdots 2 \cdot 1}. \]

Since $p$ clearly divides the numerator, we just need to show that $p$ does not divide the
denominator. But that is true (by Euclid’s Lemma 1.5.1 or unique factorization into primes)
since $p$ divides no factor in the denominator. This means that the binomial coefficients $\binom{p}{k}$
that are not congruent to 0 mod $p$ are only those of the $k = 0$ and $k = p$ terms in the sum
representing $(x + y)^p$. ▲
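
Both halves of this argument can be watched in action for small numbers. The Python sketch below is our illustration: it checks that the middle binomial coefficients vanish mod 7, that $(x + y)^7 = x^7 + y^7$ throughout $\mathbb{Z}_7$, and that both statements fail for the non-prime modulus 6 (where, of course, the lemma does not apply since $\mathbb{Z}_6$ is not an integral domain).

```python
from math import comb


def middle_binomials_vanish(n):
    """True if n divides C(n, k) for every k = 1, ..., n - 1."""
    return all(comb(n, k) % n == 0 for k in range(1, n))


def dream_holds(n):
    """True if (x + y)^n = x^n + y^n for all x, y in Z_n."""
    return all(pow(x + y, n, n) == (pow(x, n, n) + pow(y, n, n)) % n
               for x in range(n) for y in range(n))


print([comb(7, k) for k in range(8)])               # [1, 7, 21, 35, 35, 21, 7, 1]
print(middle_binomials_vanish(7), dream_holds(7))   # True True
print(middle_binomials_vanish(6), dream_holds(6))   # False False
```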


Exercise 5.3.9 Which of the following rings are integral domains and which are fields? Give 
a brief explanation of your answer. 


(1) $\mathbb{Z}[i] = \{a + bi \mid a, b \in \mathbb{Z}\}$, where $i \in \mathbb{C}$, $i^2 = -1$. The ring operations are the standard
ones in $\mathbb{C}$.

(2) Z/12Z with the usual ring operations mod 12. 

(3) as 2 x 2 matrices with coefficients in Z,. The ring operations are as in equations 
(5.1) and (5.2) of Section 5.2. 

(4) Z,, with the usual ring operations mod 12. 

(5) $\mathbb{Z} \oplus \mathbb{Z}$ with componentwise addition and multiplication: $(a, b) + (c, d) = (a + c, b + d)$
and $(a, b) \cdot (c, d) = (ac, bd)$, for $a, b, c, d \in \mathbb{Z}$.

(6) Q, the rational numbers with the usual ring operations. 

(7) R[x], if R is a ring, x an indeterminate. 


Exercise 5.3.10 Are the following rings integral domains? Are there elements of the rings 
that are neither units nor zero divisors? 


(a) $C[0, 1] = \{\text{continuous real-valued functions } f : [0, 1] \to \mathbb{R}\}$ with addition and multiplication
defined (as usual in calculus) pointwise, that is,

$(f + g)(x) = f(x) + g(x)$ and $(fg)(x) = f(x)g(x)$, $\forall x \in [0, 1]$.

(b) $C(\mathbb{Z}_n) = \{f : \mathbb{Z}_n \to \mathbb{R}\}$ with addition defined as in part (a) but multiplication defined by
convolution from equation (4.1) of Section 4.2.


Exercise 5.3.11 


(a) List all the zero divisors - if any - in the seven rings from Exercise 5.3.9.
(b) List all the units in the rings $R$ of part (a); that is, find $R^*$.
(c) What is the relation between the zero divisors and the units of $R$, if any?

Exercise 5.3.12 Show that if $D$ is an integral domain of characteristic 0 and $D' = \langle 1 \rangle$ is the
cyclic subgroup of the additive group of $D$ generated by 1, then $D'$ and $\mathbb{Z}$ are isomorphic
rings. This means that you must find a 1-1 function $T$ mapping $\mathbb{Z}$ onto $D'$ which preserves
addition and multiplication.


In Section 6.1 we will have a similar exercise to the last one for fields of prime 
characteristic p. See Exercise 6.1.9. 

A division ring is a non-commutative ring with identity such that the nonzero elements 
form a group under multiplication. An example is the quaternions defined in equation (3.4) 
of Section 3.6. Wedderburn proved that a finite division ring is a field. For a proof see 
Herstein [42]. 


Definition 5.3.7 An ordered field $F$ is a field with a subset $P \subset F$ of positive elements
having the properties O1, O2, and O3, as in Section 1.3:

O1 $F = P \cup \{0\} \cup (-P)$, where $-P = \{-x \mid x \in P\}$. Moreover this is a disjoint union,
that is: $0 \notin P$, $0 \notin -P$, $P \cap (-P) = \emptyset$.
O2 $n, m \in P \implies n + m \in P$.
O3 $n, m \in P \implies n \cdot m \in P$.


Then we define, for $a, b \in F$, $a < b$ to mean that $b - a \in P$. This ordering will have the
same properties as that of $\mathbb{Z}$ discussed in Section 1.3. The fields of rational numbers $\mathbb{Q}$ and
real numbers $\mathbb{R}$ are ordered fields. The field of complex numbers $\mathbb{C}$ is not an ordered field.


Exercise 5.3.13 Show that a subfield of an ordered field is an ordered field. 


Exercise 5.3.14 Show that C is not an ordered field. 


5.4 Building New Rings from Old: Quotients and Direct Sums of Rings 


We need to build quotient rings in the same way that we constructed Z/nZ. We will also 
imitate the construction of quotient groups in Section 3.4. To create a quotient group using 
a subgroup H of a group G, it was necessary that H be a normal subgroup. It turns out we 
will need a similar notion for the subring S of ring R. That is, we will need S to be an ideal 
in R as in the next definition. 


Definition 5.4.1 The non-empty subset $A$ of a ring $R$ is an ideal iff $A$ is an additive
subgroup of $R$ such that $ra \in A$ and $ar \in A$, $\forall r \in R$ and $\forall a \in A$.


Some would say “two-sided ideal” rather than ideal. Note that every ring R has two 
ideals: {0} and R. Any other ideal is called a proper ideal. 


Example. $n\mathbb{Z}$ is an ideal in $\mathbb{Z}$. We call $n\mathbb{Z}$ the principal ideal generated by $n$ and write
$n\mathbb{Z} = (n)$. ▲

Definition 5.4.2 Given a ring $R$ and an element $a \in R$, the (2-sided) ideal generated by
$a$, denoted $(a)$, consists of elements $ra$ and $ar$ for all $r \in R$. Such an ideal $(a)$ is called a
principal ideal. Similarly, the ideal $(S)$ generated by a subset $S$ of $R$ is the smallest ideal
containing $S$.


Exercise 5.4.1 (The Ideal Generated by a Set).
(a) Suppose that $R$ is a commutative ring with identity. Let $S \subset R$. Show that

$$(S) = \left\{ \sum_{i=1}^{n} r_i s_i \;\middle|\; r_i \in R,\ s_i \in S,\ n \in \mathbb{Z}^+ \right\}$$

is indeed an ideal. We call it the ideal generated by $S$.
(b) Suppose $R = \mathbb{Z}$. Show that if $a, b \in \mathbb{Z}$, then $(\{a, b\}) = (\gcd(a, b))$.


Ideals were introduced by Richard Dedekind in 1879. The main use for them in number 
theory is to get a substitute for prime numbers - the prime ideals we are about to define. 
This allows one to have unique factorization of ideals in rings of integers of algebraic
number fields into products of prime ideals, though the unique factorization fails for actual
algebraic integers in a ring like $\mathbb{Z}[e^{2\pi i/n}]$ for most values of $n$. Another way to say that is
not all ideals in $\mathbb{Z}[e^{2\pi i/n}]$ are principal ideals. The concept of ideal was further developed
by David Hilbert and Emmy Noether. We will find applications in error-correcting codes 
and random number generators. Ideal theory was not accepted by everyone. In particular, 
Leopold Kronecker objected to ideal theory just as he objected to the work of Karl Weierstrass 
in analysis and Georg Cantor in set theory. It appears at the moment, however, that history 
has found that Kronecker was the loser in all these wars. 

To construct the quotient ring $R/A$, assuming that $A$ is an ideal in the ring $R$, we create
the set of additive cosets $[x] = x + A = \{x + a \mid a \in A\}$ for each $x \in R$. Once again, you can
view these cosets as equivalence classes for the equivalence relation defined on elements
$x, y \in R$ by $x \sim y$ iff $x - y \in A$.


Exercise 5.4.2 Prove this last statement. 


Then we add and multiply cosets as we did for $\mathbb{Z}_n$:


$$[x] + [y] = [x + y], \qquad [x] \cdot [y] = [xy]. \tag{5.7}$$


This defines the quotient ring (or factor ring) R/A. 


Theorem 5.4.1 Suppose that A is a subring of the ring R. Then, with the definitions just 
given in (5.7), R/A is a ring iff A is an ideal in R. 


Proof. ⟸ If $A$ is an ideal in $R$, we need to see that the operations defined in equation (5.7)
make $R/A$ a ring. Once we have checked that the operations are well defined, everything
else will be easy. To check the operations make sense, suppose that $[x] = [x']$ and $[y] = [y']$.
Then we must show that $[x + y] = [x' + y']$ and $[xy] = [x'y']$. In fact, we have already checked
the additive part in Section 3.4, since $A$ is automatically a normal subgroup of the additive
group of $R$. Why?

So we will just check the multiplicative part. We need to prove that $xy - x'y' \in A$. To do
this, we recall proofs of the formula for the derivative of a product. That means we should
add and subtract $x'y$ (or $xy'$). This gives


$$xy - x'y' = xy - x'y + x'y - x'y' = (x - x')y + x'(y - y').$$


Since both $x - x'$ and $y - y'$ are in the ideal $A$, it follows that $(x - x')y$ and $x'(y - y')$ are
both in $A$. But then the sum must be in $A$ and we are done.

So now we know addition and multiplication in R/A are well defined. From the fact that 
R is a ring, it is easy to see that R/A must be a ring too. The identity for addition in R/A 
is [0]. The additive inverse of [a] is [—a]. The associative laws in R/A follow from the laws 
in R, as do the distributive laws. 
⟹ Conversely, if $R/A$ is a ring, the multiplication of cosets defined in (5.7) must be well
defined. Since $[0] = A$, for any $a \in R$, we have $[0][a] = [a][0] = [0]$. This means $AR \subset A$ and
$RA \subset A$. Of course $A$ must also be closed under addition and subtraction as $[0] + [0] = [0]$
in $R/A$. Thus $A$ must be an ideal in $R$. ▲


Example. Consider the ring $\mathbb{R}[x]$ of polynomials in the indeterminate $x$. An example of an
ideal in this ring is the principal ideal generated by $x^2 + 1$:


$$A = (x^2 + 1) = \{f(x)(x^2 + 1) \mid f(x) \in \mathbb{R}[x]\}.$$
▲

Question. What is $\mathbb{R}[x]/(x^2 + 1)$?


Answer. We can identify this quotient with the ring $\mathbb{C}$ of complex numbers. To give some
evidence for this statement, let $\theta = [x] = x + A = x + (x^2 + 1)$ in $\mathbb{R}[x]/A$. Then
$\theta^2 + 1 = [x]^2 + [1] = [x^2 + 1] = [0]$. This means $\theta^2 = -1$ in our ring $\mathbb{R}[x]/A$. So $\theta$ behaves
like $i$ in $\mathbb{C}$.

In order to prove our statement identifying $\mathbb{C}$ and $\mathbb{R}[x]/(x^2 + 1)$, we need to study
polynomial rings a little more. See Section 5.5. In particular, we need the analog of the division
algorithm for polynomial rings like $\mathbb{R}[x]$. Once we have that, we can identify cosets $[f(x)]$
in $\mathbb{R}[x]/A = \mathbb{R}[x]/(x^2 + 1) = \mathbb{R}[x]/(x^2 + 1)\mathbb{R}[x]$ with cosets of the remainders $[r(x)]$ upon
dividing $f(x)$ by $x^2 + 1$; that is, $f(x) = (x^2 + 1)q(x) + r(x)$, where $\deg r < 2$ or $r(x) = 0$. So
the remainders look like $a + bx$, with $a, b \in \mathbb{R}$. This means elements of $\mathbb{R}[x]/A$ have the
form $[a + bx] = [a] + [b][x] = a + b\theta$, which we can identify with a complex number $a + bi$,
for $a, b \in \mathbb{R}$.
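
The identification of remainders $a + bx$ with complex numbers $a + bi$ can be checked numerically. The following is only a minimal sketch: the function name `mul_mod_x2_plus_1` and the sample inputs are illustrative choices, not part of the text. It multiplies two remainders, reduces with $x^2 = -1$, and compares the result with Python's built-in complex multiplication.

```python
def mul_mod_x2_plus_1(p, q):
    """Multiply a + b*x and c + d*x, then reduce modulo x^2 + 1.

    Pairs (a, b) stand for the coset [a + b*x]; the names are illustrative.
    """
    a, b = p
    c, d = q
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 = -1 in the quotient.
    return (a * c - b * d, a * d + b * c)


p, q = (1.0, 2.0), (3.0, -1.0)
prod = mul_mod_x2_plus_1(p, q)
z = complex(*p) * complex(*q)
print(prod, z)  # both represent 5 + 5i
```

The coset $[x]$ corresponds to the pair `(0, 1)`, and squaring it with this function returns `(-1, 0)`, matching $\theta^2 = -1$.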


Our Goal. Replace $\mathbb{R}$ in the preceding construction with a finite field $\mathbb{Z}_p$. Then replace $x^2 + 1$
with any irreducible polynomial mod $p$, where irreducible means the analog of prime: a
polynomial with no non-trivial factorization (see Definition 5.5.1). Then apply the result to
error-correcting codes in Section 8.2. For example, take $p = 2$. Since $x^2 + 1 = (x + 1)^2$ in
$\mathbb{Z}_2[x]$, we know that $x^2 + 1$ is not an irreducible polynomial in $\mathbb{Z}_2[x]$. An irreducible
polynomial is our analog of a prime in the ring $\mathbb{Z}_p[x]$. An example of an irreducible polynomial
in $\mathbb{Z}_2[x]$ is $x^2 + x + 1$. Why? It has no roots mod 2 and thus cannot have degree 1 factors
as we will show in Section 5.5 on polynomial rings. This will imply that $\mathbb{Z}_2[x]/(x^2 + x + 1)$
is a field with four elements $\{[0], [1], [x], [x+1]\}$ where $[x]^2 + [x] + [1] = [0]$.
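
To make the four-element field concrete, here is a small sketch (not the book's construction, just one way to model it). Cosets are stored as pairs `(a0, a1)` meaning $[a_0 + a_1 x]$; the helper name `mul_gf4` is hypothetical. Reduction uses $x^2 = x + 1$, which follows from $x^2 + x + 1 = 0$ with coefficients mod 2.

```python
from itertools import product


def mul_gf4(p, q):
    """Multiply [a0 + a1*x] and [b0 + b1*x] in Z_2[x]/(x^2 + x + 1)."""
    a0, a1 = p
    b0, b1 = q
    c0 = a0 * b0             # constant term
    c1 = a0 * b1 + a1 * b0   # coefficient of x
    c2 = a1 * b1             # coefficient of x^2
    # Replace x^2 by x + 1 and reduce the coefficients mod 2.
    return ((c0 + c2) % 2, (c1 + c2) % 2)


elements = list(product((0, 1), repeat=2))
nonzero = [e for e in elements if e != (0, 0)]
one = (1, 0)
# Every nonzero element should have a multiplicative inverse.
inverses = {e: next(f for f in nonzero if mul_gf4(e, f) == one) for e in nonzero}
print(inverses)
```

In this model $[x]$ and $[x+1]$ are inverses of each other, which is the field property that fails for $\mathbb{Z}_2[x]/(x^2+1)$.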

The following theorem says that every ideal in the ring of integers is principal. We will
have a similar theorem later about polynomial rings like $\mathbb{R}[x]$ or $\mathbb{Z}_p[x]$, $p$ prime.


Theorem 5.4.2 Any ideal $A$ in the ring $\mathbb{Z}$ of integers is principal; that is, $A = (n) = n\mathbb{Z}$
for some $n \in \mathbb{Z}$. In fact, if $A \neq \{0\}$, we can choose $n$ to be the smallest positive element
of $A$.


Proof. The case $A = \{0\} = (0)$ is clear. Otherwise $A \neq \{0\}$ and we let $n$ be the least positive
element of $A$. Then $(n) \subset A$. Suppose that $a \in A$. The division algorithm says $a = nq + r$,
with $0 \le r < n$. Since $A$ is an ideal and $r = a - nq$, we know that $r \in A$. But $n$ is the least
positive element of $A$. Therefore $r = 0$. This implies $A \subset (n)$. So $A = (n)$. ▲
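
As an informal illustration of the theorem, take the ideal of $\mathbb{Z}$ consisting of all combinations $ua + vb$. The sketch below searches a finite window of coefficients (the window size `bound` is an arbitrary assumption, large enough for the small inputs used here) and finds the least positive element, which should agree with `math.gcd`.

```python
from math import gcd


def least_positive_combination(a, b, bound=20):
    """Least positive value of u*a + v*b with |u|, |v| <= bound (finite search)."""
    values = {u * a + v * b
              for u in range(-bound, bound + 1)
              for v in range(-bound, bound + 1)}
    return min(x for x in values if x > 0)


n = least_positive_combination(12, 18)
print(n, gcd(12, 18))
```

This is a brute-force check on examples, not a proof; the proof is the division-algorithm argument given in the text.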


Exercise 5.4.3 In the preceding proof, why must the nonzero ideal A of Z have a positive 
element? 


Exercise 5.4.4 As in the preceding example constructing the complex numbers, show that
the quotient ring $\mathbb{Q}[x]/(x^2 - 2)$, where $x$ is an indeterminate, can be identified with the ring
$\mathbb{Q}[\sqrt{2}]$ consisting of numbers $a + b\sqrt{2}$, with $a, b \in \mathbb{Q}$, under the usual sum and product of
real numbers.


In pursuit of our goal, we ask two questions that lead us to two definitions. 
Question 1. Suppose A is an ideal in R. When is the quotient ring R/A an integral domain? 


Answer. When A is a prime ideal, which is defined as follows. 


Definition 5.4.3 Suppose that $A$ is an ideal in the ring $R$, a commutative ring with
identity for multiplication. We say that the proper ideal $A$ is a prime ideal in $R$ iff, for
$a, b \in R$, $ab \in A \implies$ either $a$ or $b \in A$.


Lemma 5.4.1 Suppose that A is an ideal in the ring R, a commutative ring with identity for 
multiplication. Then R/A is an integral domain iff A is a prime ideal. 


Proof. First note that $R/A$ automatically has all the usual properties of an integral domain
except for the lack of zero divisors. It inherits these properties from $R$. For example, the
identities for addition and multiplication are $[0]$ and $[1]$, respectively. We get a zero divisor
in $R/A$ iff there are $a, b \in R$ such that $[a][b] = [0]$ but $[a] \neq [0]$ and $[b] \neq [0]$. This means
$ab \in A$ but $a \notin A$ and $b \notin A$. That is equivalent to saying that $A$ is not a prime ideal. ▲


If $R$ is a commutative ring with identity for multiplication, then $R$ is an integral domain
iff $(0)$ is a prime ideal.


Example. Which nonzero ideals $(n)$ in $\mathbb{Z}$ are prime ideals? The answer is the ideals $(p)$ with
$p$ a prime. ▲

Proof. First note that $ab \in (n) = n\mathbb{Z}$ is equivalent to saying $n$ divides $ab$.

If $n$ is not a prime, then $n = ab$ with $1 < a, b < n$. It follows that $ab \in (n)$, but $n$ cannot
divide either $a$ or $b$. Thus $(n)$ cannot be a prime ideal in $\mathbb{Z}$.

If $p$ is a prime and $ab \in (p)$, then $p$ divides $ab$. Euclid's Lemma 1.5.1 tells us that then $p$
must divide either $a$ or $b$. Thus either $a$ or $b$ must be in $(p)$ and $(p)$ is a prime ideal. ▲
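
The statement can be tested by brute force for small $n$. The sketch below is only an illustration (the function name `is_prime_ideal` and the range 2 to 30 are arbitrary choices): it declares $(n)$ prime when no pair of residues $a, b$ outside $(n)$ has product in $(n)$, that is, when $\mathbb{Z}_n$ has no zero divisors.

```python
def is_prime_ideal(n):
    """Brute-force test that (n) is a prime ideal of Z, for n >= 2."""
    for a in range(1, n):
        for b in range(1, n):
            if (a * b) % n == 0:
                # a and b are not in (n), but their product is.
                return False
    return True


prime_generators = [n for n in range(2, 31) if is_prime_ideal(n)]
print(prime_generators)
```

The list produced consists exactly of the primes up to 30, as the example predicts.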


But we really want the answer to the following question. 


Question 2. Suppose A is an ideal in R, a commutative ring with identity for multiplication. 
When is the quotient ring R/A a field? 


Answer. When A is a maximal ideal, which is defined as follows. 


Definition 5.4.4 Suppose $A$ is an ideal in $R$, a commutative ring with identity for
multiplication. We say that the proper ideal $A$ is a maximal ideal in $R$ iff, for an ideal $B$ of
$R$, the containment $A \subset B \subset R \implies$ either $B = A$ or $B = R$.


Lemma 5.4.2 Suppose A is an ideal in R, a commutative ring with identity for multiplica- 
tion. The quotient ring R/A is a field iff A is a maximal ideal in R. 


Proof. First note that R/A automatically inherits all the properties of a field from R except 
closure under inverse for nonzero elements - a property that R does not necessarily have. 
In particular, [0] =A is the identity for addition in R/A and [1] =1+A is the identity for 
multiplication. 

Suppose that $R/A$ is a field. If $B$ is an ideal such that $A \subset B \subset R$ but $B \neq A$, then we need
to show that $B = R$. Since $B \neq A$, there is an element $x \in B - A$. This means $[x] \neq [0]$ in $R/A$.
Since $R/A$ is a field, there exists $[y] \in R/A$ such that $[x][y] = [1]$. This means $xy - 1 \in A$. Thus
$1 = xy - u$ for some $u \in A$. But then, because $B$ is an ideal containing $x$ and $u$, it follows
that $1 \in B$. Therefore, for any $r \in R$, we have $r = 1 \cdot r \in B$ and $B = R$. Thus $A$ is a maximal
ideal.

Now suppose that $A$ is maximal. We need to show that $R/A$ is a field. Suppose $[x] \neq [0]$
in $R/A$. We need to find $[x]^{-1}$. Look at the ideal $B$ generated by $A$ and $x$. That is
$B = \{u + rx \mid u \in A, r \in R\} = (A, x)$. Then $A \subset B \subset R$. We know that $A \neq B$. Since $A$ is maximal,
it follows that $B = R$. But then $1 \in B$. So $1 = u + rx$ for some $u \in A$, $r \in R$. This implies
$[r][x] = [1]$. So $[r] = [x]^{-1}$ and we are done. ▲
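
In the special case $R = \mathbb{Z}$ and $A = (p)$ with $p$ prime, the equation $1 = u + rx$ from the proof reads $1 = kp + rx$, and the extended Euclidean algorithm produces $r$ and $k$ explicitly. The following is a sketch under that assumption; the function names are illustrative.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(x, p):
    """Inverse of [x] in Z/pZ, read off from 1 = r*x + k*p (so u = k*p lies in (p))."""
    g, r, k = extended_gcd(x, p)
    if g != 1:
        raise ValueError("x is not invertible modulo p")
    return r % p


print(inverse_mod(5, 7))  # 3, since 5 * 3 = 15 = 1 (mod 7)
```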


Exercise 5.4.5 In the preceding proof, show that $B = \{u + rx \mid u \in A, r \in R\}$ is an ideal in
$R$, assuming that $A$ is an ideal in $R$.


Example 1. Which ideals (n) in Z are maximal ideals? The answer is that the nonzero 
prime ideals in Z are the maximal ideals in Z. We know this since we proved finite integral 
domains are fields in Proposition 5.3.1. ▲


Exercise 5.4.6 Prove the nonzero prime ideals in Z are the maximal ideals in Z directly by 
showing Z/nZ is a field iff n is prime. 

Example 2. What are the maximal ideals in $\mathbb{Z}_{12}$?

First we show that all ideals $A$ in $\mathbb{Z}_{12}$ are principal. To prove this, consider the
corresponding ideal $\tilde{A}$ in $\mathbb{Z}$, which is $\tilde{A} = \{m \in \mathbb{Z} \mid [m] \in A\}$. Here we use the coset notation
$[m] = m + 12\mathbb{Z}$. We leave it as an exercise to show that $\tilde{A}$ is an ideal in $\mathbb{Z}$. Now, we know
any ideal in $\mathbb{Z}$ is principal. Thus $\tilde{A} = (n) = n\mathbb{Z}$ for some $n \in \mathbb{Z}$. But then $A = [n]\mathbb{Z}_{12}$. Why?

Next suppose that $[u]$ is a unit in $\mathbb{Z}_{12}$. Then we can show that $[u][n]\mathbb{Z}_{12} = [n]\mathbb{Z}_{12}$.
For the fact that $[r] = [u]^{-1}$ exists in $\mathbb{Z}_{12}$ implies $[n]\mathbb{Z}_{12} = [u]^{-1}[u][n]\mathbb{Z}_{12} \subset [un]\mathbb{Z}_{12}$.
There is no problem seeing the reverse inclusion $[un]\mathbb{Z}_{12} \subset [n]\mathbb{Z}_{12}$. This is a
general fact about principal ideals, by the way. ▲


Exercise 5.4.7 


(a) Show that if $A$ is an ideal in $\mathbb{Z}_{12}$, then $\tilde{A} = \{m \in \mathbb{Z} \mid [m] \in A\}$ is an ideal in $\mathbb{Z}$.
(b) Answer the “why?” at the end of the first paragraph of Example 2.


The units in $\mathbb{Z}_{12}$ are $[1], [5], [7], [11]$. Of course the principal ideals generated by units
are all of the ring. Moreover, elements $a$ and $ua$, for a unit $u$, generate the same ideal. Now
we can list all the ideals in $\mathbb{Z}_{12}$. They are (dropping the [ ]):


$(0) = \{0\}$, $(1) = (5) = (7) = (11) = \mathbb{Z}_{12}$,
$(2) = (10) = 2\mathbb{Z}_{12} = \{0, 2, 4, 6, 8, 10 \pmod{12}\}$,
$(3) = (9) = 3\mathbb{Z}_{12} = \{0, 3, 6, 9 \pmod{12}\}$,
$(4) = (8) = 4\mathbb{Z}_{12} = \{0, 4, 8 \pmod{12}\}$,
$(6) = 6\mathbb{Z}_{12} = \{0, 6 \pmod{12}\}$.


The poset diagram for the ideals in $\mathbb{Z}_{12}$ under the relation $\subset$ is in Figure 5.3. It follows
from Figure 5.3 that the maximal ideals in $\mathbb{Z}_{12}$ are $(2)$ and $(3)$. Note the connection with
the poset diagram for the divisors of 12 under the relation | of divisibility.
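
The list of ideals of $\mathbb{Z}_{12}$ can be reproduced by machine. The sketch below relies on the fact shown above that every ideal of $\mathbb{Z}_{12}$ is principal, so it only enumerates the sets $a\mathbb{Z}_{12}$; the variable names are illustrative. An ideal is reported as maximal when it is not the whole ring and is not strictly contained in another such ideal.

```python
n = 12

# Map each principal ideal a*Z_12 (as a frozenset) to the list of its generators.
ideals = {}
for a in range(n):
    ideal = frozenset((a * r) % n for r in range(n))
    ideals.setdefault(ideal, []).append(a)

# Ideals different from the whole ring Z_12.
not_whole_ring = [I for I in ideals if len(I) < n]

# Maximal: not strictly contained in any other ideal short of the whole ring.
maximal = [I for I in not_whole_ring if not any(I < J for J in not_whole_ring)]

for I, gens in ideals.items():
    print(gens, sorted(I))
print("maximal:", [sorted(I) for I in maximal])
```

Changing `n` gives a quick way to experiment with other moduli.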


[Figure 5.3 Poset diagram of the ideals in $\mathbb{Z}_{12}$, with $(1)$ at the top and $(0)$ at the bottom.]

Exercise 5.4.8 Explain the equalities for the ideals of $\mathbb{Z}_{12}$: for example, why is it that
$(8) = (4)$?

Exercise 5.4.9 Show that every ideal in the ring $\mathbb{Z}_n$ is principal.


Exercise 5.4.10 Suppose that $R$ is an integral domain. Show that, for $a, b \in R$, we have
$aR = (a) = (b) = bR$ iff $a = ub$, where $u$ is a unit in $R$.

Exercise 5.4.11 Find all the maximal ideals in Zg. Can you do the analogous exercise for
the general case Z,,? 


Exercise 5.4.12 Draw the poset diagram for ideals in Z39. Which ideals are maximal? 


Our second method for the construction of rings is the ring analog of direct sum of 
groups. 


Definition 5.4.5 To create the direct sum $R \oplus S$ of two rings $R$ and $S$, we start with the
Cartesian product $R \times S$ and define the ring operations componentwise. That is, for $(a, b)$
and $(r, s) \in R \times S$, we define $(a, b) + (r, s) = (a + r, b + s)$ and $(a, b)(r, s) = (ar, bs)$.
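
Componentwise operations are easy to model. The sketch below builds the operations of $\mathbb{Z}_m \oplus \mathbb{Z}_n$ for given moduli; the choice $m = 2$, $n = 3$ and the helper name `direct_sum_ops` are assumptions made only for this illustration. It then lists the zero divisors found by brute force.

```python
from itertools import product


def direct_sum_ops(m, n):
    """Return (add, mul) for Z_m (+) Z_n with componentwise operations."""
    def add(u, v):
        return ((u[0] + v[0]) % m, (u[1] + v[1]) % n)

    def mul(u, v):
        return ((u[0] * v[0]) % m, (u[1] * v[1]) % n)

    return add, mul


add, mul = direct_sum_ops(2, 3)
elements = list(product(range(2), range(3)))
zero = (0, 0)
zero_divisors = [u for u in elements
                 if u != zero and any(v != zero and mul(u, v) == zero for v in elements)]
print(zero_divisors)
```

For instance `mul((1, 0), (0, 1))` is `(0, 0)`, so a direct sum of two nonzero rings always has zero divisors of this shape.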


Exercise 5.4.13 Show that the preceding definition makes $R \oplus S$ a ring.


Exercise 5.4.14 Is Z, ®Z, a field, an integral domain? Same question for Zz @ Z3. 


Exercise 5.4.15 Find the characteristics of the following rings: Z.2®Z2, Zz, BZ4, Zz PZ. 


Exercise 5.4.16 Find a subring of $R = \mathbb{Z} \oplus \mathbb{Z}$ that is not an ideal.


Hint. Look at S={(a,b) |a+ b is even}. 


The definitions of sum and product of ideals in the following exercise are basic to
Dedekind’s approach to arithmetic in rings of algebraic integers like $\mathbb{Z}[e^{2\pi i/n}]$, where, in
general, not every ideal is principal.


Exercise 5.4.17 If $A$ and $B$ are ideals in a commutative ring $R$ with identity for
multiplication, define the sum $A + B = \{a + b \mid a \in A, b \in B\}$ and the product

$$AB = \left\{ \sum_{i=1}^{n} a_i b_i \;\middle|\; a_i \in A,\ b_i \in B,\ n \in \mathbb{Z}^+ \right\}.$$

(a) Show that $A + B$ and $AB$ are ideals of $R$.
(b) Show that $A + B = R$ implies $AB = A \cap B$.


The moral of the preceding exercise is that one can do arithmetic with ideals. 


Exercise 5.4.18 Suppose $R = \mathbb{Z}$. If $A = (a)$ and $B = (b)$, show that $A + B = (\gcd(a, b))$. If
$A + B = (1)$, show that $AB = (ab) = A \cap B$.


Exercise 5.4.19 Suppose that $R, S, T, V$ are rings. Show that if $R \cong T$ and $S \cong V$, then
$R \oplus S \cong T \oplus V$. Here $\cong$ means that the rings are isomorphic.


Exercise 5.4.20 Suppose that F is a field. Find all the ideals in F. 


Exercise 5.4.21 Suppose that $R$ and $S$ are rings. What are the ideals in the direct sum
$R \oplus S$?


Exercise 5.4.22 Find a prime ideal in $\mathbb{Z} \oplus \mathbb{Z}$ which is not maximal.


5.5 Polynomial Rings


Suppose that $F$ is a field. We consider the ring $F[x]$ of polynomials in one indeterminate, $x$,
and coefficients in $F$.

Beware: Do not confuse polynomials and functions. For example, in $\mathbb{Z}_3[x]$ the polynomials
$f(x) = x^4 + x + 1$ and $g(x) = x^2 + x + 1$ represent the same function even though the two
polynomials are different. To see this, plug in the elements of $\mathbb{Z}_3$.


This sort of thing has to happen, since the number of functions $T: \mathbb{Z}_3 \to \mathbb{Z}_3$ is $3^3 = 27$,
while the ring $\mathbb{Z}_3[x]$ is infinite.
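
The distinction between a polynomial and the function it induces is easy to see by evaluation. As a sketch, the code below takes the pair $1 + x + x^4$ and $1 + x + x^2$ over $\mathbb{Z}_3$ (a pair chosen here for illustration) and evaluates both at every element of $\mathbb{Z}_3$; coefficient lists are stored lowest degree first, and the helper name `evaluate` is an assumption.

```python
def evaluate(coeffs, a, p):
    """Evaluate c0 + c1*x + ... at x = a modulo p, using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * a + c) % p
    return result


p = 3
f = [1, 1, 0, 0, 1]  # 1 + x + x^4
g = [1, 1, 1]        # 1 + x + x^2
values_f = [evaluate(f, a, p) for a in range(p)]
values_g = [evaluate(g, a, p) for a in range(p)]
print(values_f, values_g)  # the two value tables coincide
```

The coefficient lists differ, so these are different elements of $\mathbb{Z}_3[x]$, yet the value tables agree.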

You probably did Exercise 5.2.7 showing that $R[x]$ is a commutative ring with multiplicative
identity if $R$ is a commutative ring with multiplicative identity. If not, do that exercise
now! The bad part is the associative law for multiplication.

If $R = \mathbb{Z}_3$, we add and multiply as in the following examples.

$(x^3 + 2x + 2) + (2x^2 + x + 2) = x^3 + 2x^2 + 1$ (since $3 \equiv 0 \pmod 3$ and $4 \equiv 1 \pmod 3$);

$(x^2 + 2x + 1) \cdot (x^3 + 2x) = x^5 + 2x^4 + x^2 + 2x$,
using the same facts about congruences mod 3 as we used for addition. We learned
to do this in the dim dark past by making the following table, with all numbers
computed mod 3.


                       x^2 + 2x + 1
                 x^3       + 2x
    --------------------------------
              2x^3 + 4x^2 + 2x
    x^5 + 2x^4 + x^3
    --------------------------------
    x^5 + 2x^4 + 0x^3 + x^2 + 2x
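
Addition and multiplication in $\mathbb{Z}_p[x]$ can be sketched with coefficient lists (lowest degree first). The helper names `poly_add`, `poly_mul`, and `trim`, and the sample inputs, are illustrative assumptions; the sample product is $(x^2 + 2x + 1)(x^3 + 2x)$ over $\mathbb{Z}_3$.

```python
def trim(coeffs):
    """Drop trailing zero coefficients, keeping at least one entry."""
    coeffs = list(coeffs)
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def poly_add(f, g, p):
    """Add two polynomials with coefficients reduced mod p."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim((a + b) % p for a, b in zip(f, g))


def poly_mul(f, g, p):
    """Multiply two nonzero-length coefficient lists mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return trim(out)


total = poly_add([2, 2, 0, 1], [2, 1, 2], 3)    # (x^3 + 2x + 2) + (2x^2 + x + 2)
product_ = poly_mul([1, 2, 1], [0, 2, 0, 1], 3)  # (x^2 + 2x + 1)(x^3 + 2x)
print(total, product_)
```

Reading the output lists from the constant term upward gives $1 + 2x^2 + x^3$ and $2x + x^2 + 2x^4 + x^5$.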


Recall that the units $R^*$ of a ring $R$ are the invertible elements for multiplication in $R$.
When $R = \mathbb{Z}$, the only units are 1 and −1. When the ring is $\mathbb{Z}[x]$, it turns out the units are
the same as for $\mathbb{Z}$, as we saw in Section 5.2. We get the analogous result when $F$ is a field,
that is:


$$(F[x])^* = F^* = F - \{0\}. \tag{5.8}$$
The proof is the same as that in Section 5.2 for $\mathbb{Z}[x]$.
Exercise 5.5.1 Prove equation (5.8). 
Now we want to imitate what we said about the integers in Section 1.5. We will have 
analogs of primes, the division algorithm, the Euclidean algorithm, and the fundamental 
theorem of arithmetic for the ring F[x], where F is any field. Pretty amazing! 


Assumption. For the rest of this section $F$ is a field and all polynomials are in $F[x]$.


We want to define the polynomial analog of prime. Before that, we should define polynomial
divisors: we say that a polynomial $g(x)$ divides a polynomial $f(x)$ if there is some
polynomial $h(x)$ such that $f(x) = g(x)h(x)$. Then we also say that $g(x)$ is a divisor of $f(x)$.
This is quite analogous to divisibility of integers.


Definition 5.5.1 A polynomial $f(x)$ of degree $> 0$ is irreducible iff $f(x) = g(x)h(x)$ for
$g, h \in F[x]$ implies either $g$ or $h$ has degree 0. Otherwise the polynomial is reducible.


Now we want to get rid of the units in a factorization as we did in $\mathbb{Z}$ by allowing only
positive non-unit integers to be called primes, assuming they cannot be factored
non-trivially. To get rid of units in $F[x]$, we look at monic polynomials.


Definition 5.5.2 A monic polynomial is a polynomial with leading coefficient (i.e., 


coefficient of the highest power of x) equal to 1. 


So a monic irreducible polynomial of degree $> 0$ is the analog of a prime in $F[x]$.


Examples: Irreducible Polynomials of Low Degree in $\mathbb{Z}_2[x]$

Degree 1 polynomials: $x$, $x + 1$. Both are irreducible.

Degree 2 polynomials: $x^2$, $x^2 + 1 = (x + 1)^2$, $x^2 + x = x(x + 1)$, $x^2 + x + 1$. The first
three polynomials are clearly reducible. What about $x^2 + x + 1$? Does $x$ or $x + 1$ divide
$x^2 + x + 1$? The answer is: No! For we have $x^2 + x + 1 = x(x + 1) + 1$. This means that if
we had $x^2 + x + 1 = xq(x)$, then $x$ would divide $1 = xq(x) - x(x + 1)$. But that is
impossible, as $0 = \deg(1) = \deg(x\{q(x) - x - 1\}) \ge 1$. A similar argument shows that $x + 1$ cannot
divide $x^2 + x + 1$.

This means that $x^2 + x + 1$ is the only irreducible polynomial of degree 2 in $\mathbb{Z}_2[x]$.

Degree 3 polynomials with nonzero constant term: $x^3 + 1$, $x^3 + x^2 + 1$, $x^3 + x + 1$,
$x^3 + x^2 + x + 1$. Which of these polynomials are irreducible? To answer this question more
rapidly, it helps to know that $x - a$ divides a polynomial $f(x)$ iff $f(a) = 0$. Here we are
assuming $a \in \mathbb{Z}_2$ and $f(x) \in \mathbb{Z}_2[x]$. This is Corollary 5.5.1 below.

The polynomial $f(x)$ of degree 3 will be reducible iff it has a factorization $f(x) = g(x)h(x)$
with $g(x), h(x) \in \mathbb{Z}_2[x]$, such that $\deg g \neq 0$ and $\deg h \neq 0$. But then $3 = \deg g + \deg h$ implies
that either $g$ or $h$ has degree 1. This means that $f$ is reducible iff $f(a) = 0$ for some $a \in \mathbb{Z}_2$.
We prove the general version of this result in our proof of Proposition 5.5.1.

So we need to test $x^3 + 1$, $x^3 + x + 1$, $x^3 + x^2 + 1$, $x^3 + x^2 + x + 1$ for roots in $\mathbb{Z}_2$.
However, the only possible root is 1, since we have already eliminated the polynomials with 0
as a root. The polynomials with an even number of terms will have 1 as a root in $\mathbb{Z}_2$.

This implies that there are only two irreducible degree 3 polynomials in $\mathbb{Z}_2[x]$:


$$x^3 + x + 1 \quad\text{and}\quad x^3 + x^2 + 1.$$
▲
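
The root test used in these examples can be automated for degrees 2 and 3 over $\mathbb{Z}_2$. The sketch below is valid only for those degrees, since in higher degree a polynomial can be reducible without having a root; the function names are illustrative. Coefficient lists are lowest degree first.

```python
from itertools import product


def has_root_mod2(coeffs):
    """True if the polynomial c0 + c1*x + ... has a root in Z_2."""
    return any(sum(c * a ** i for i, c in enumerate(coeffs)) % 2 == 0
               for a in (0, 1))


def irreducible_low_degree(deg):
    """Monic polynomials of degree 2 or 3 over Z_2 with no root in Z_2."""
    if deg not in (2, 3):
        raise ValueError("the root test only decides irreducibility for degree 2 or 3")
    result = []
    for lower in product((0, 1), repeat=deg):
        coeffs = list(lower) + [1]  # leading coefficient 1
        if not has_root_mod2(coeffs):
            result.append(coeffs)
    return result


print(irreducible_low_degree(2))  # [[1, 1, 1]] i.e. x^2 + x + 1
print(irreducible_low_degree(3))  # x^3 + x^2 + 1 and x^3 + x + 1
```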


Exercise 5.5.2 Find the degree 4 irreducible polynomials in $\mathbb{Z}_2[x]$.


Exercise 5.5.3 Find the monic irreducible polynomials of degrees 1 and 2 in $\mathbb{Z}_3[x]$.


In order to do the same things for rings of polynomials $F[x]$, when $F$ is a field, that we did
for the ring $\mathbb{Z}$ of integers, we will need a division algorithm. The division algorithm works
just as it did in high school or wherever it was introduced. In fact, it really works the same
way it did for the integers in elementary school (as in Section 1.5). It is important, however,
that $F$ is a field because we must divide by the leading coefficient of the divisor polynomial.


Example 1. In Z|2], we have the following computation. 


van +1 
Pteti)|etetAt+eP +4441 
rae ee 
e+e 
etext 
0 


As a result, we have + 2° +a? +2? +4+1=(2? +441) (2? + 1). The remainder 
is 0. A 


Example 2. In $\mathbb{Z}_3[x]$, we have the following computation.


                    2x + 1
    2x + 1 ) x^2 +  x + 2
             x^2 + 2x
             ------------
                   2x + 2
                   2x + 1
                   ------
                        1


This says $x^2 + x + 2 = (2x + 1)(2x + 1) + 1$. The remainder is 1. Note that we are
definitely using the fact that $\mathbb{Z}_3$ is a field and $2^{-1} \equiv 2 \pmod 3$. That is,
$2 \cdot 2 \equiv 1 \pmod 3$. ▲


Theorem 5.5.1 (The Division Algorithm for Polynomial Rings). Suppose that $F$ is a field.
Given $f(x)$ and $g(x) \in F[x]$ with $g(x)$ not the zero polynomial, there are polynomials $q(x)$
(the quotient) and $r(x)$ (the remainder) in $F[x]$ such that $f(x) = g(x)q(x) + r(x)$ and
$\deg r < \deg g$ or $r$ is the zero polynomial.


Sketch of Proof (Induction on $\deg f$). If $\deg g = 0$, the result is trivial, as $g$ is then a unit
in the ring $F[x]$. So we assume $\deg g > 0$ from now on. The result is clear if $\deg f < \deg g$ as
then we can take $q = 0$ and $r = f$.

Induction step. Now assume the theorem true if $\deg f \le m - 1$. Suppose that $f$ has degree
$m$ and

$$f(x) = b_m x^m + \cdots \quad\text{and}\quad g(x) = a_n x^n + \cdots, \quad\text{with } a_n \neq 0 \text{ and } b_m \neq 0.$$

We may assume $m \ge n$ or we can take $r = f$ as we have said already.
Then we start the process by choosing the first term of $q(x)$ to be $a_n^{-1} b_m x^{m-n}$ so that

$$h(x) = f(x) - a_n^{-1} b_m x^{m-n} g(x)$$

has degree less than $\deg f = m$. That is

                              a_n^{-1} b_m x^{m-n} + ...
    g(x) = a_n x^n + ... )  f(x) = b_m x^m + ...
                                   b_m x^m + ...
                            ---------------------------
                            h = lower degree polynomial than f


This gets the induction going. The induction hypothesis allows us to divide $h$ by $g$ and we
are done. ▲
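
The proof sketch translates directly into a procedure: repeatedly cancel the leading term of the running remainder using $a_n^{-1} b_m x^{m-n}$. The following is a minimal sketch for $F = \mathbb{Z}_p$ with $p$ prime, written iteratively rather than by induction. The function name `poly_divmod` is illustrative, coefficient lists are lowest degree first, and `pow(a, -1, p)` (modular inverse) assumes Python 3.8 or later.

```python
def poly_divmod(f, g, p):
    """Divide f by g in Z_p[x] (p prime); return (quotient, remainder)."""
    f = [c % p for c in f]
    g = [c % p for c in g]
    while g and g[-1] == 0:      # strip zero leading coefficients of the divisor
        g.pop()
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    lead_inv = pow(g[-1], -1, p)  # inverse of the leading coefficient a_n
    q = [0] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    # Cancel the coefficient of x^(shift + deg g) for shift = m - n down to 0.
    for shift in range(len(f) - len(g), -1, -1):
        coef = (r[shift + len(g) - 1] * lead_inv) % p
        q[shift] = coef
        for i, c in enumerate(g):
            r[shift + i] = (r[shift + i] - coef * c) % p
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    while len(q) > 1 and q[-1] == 0:
        q.pop()
    return q, r


# x^2 + x + 2 divided by 2x + 1 in Z_3[x]
print(poly_divmod([2, 1, 1], [1, 2], 3))  # ([1, 2], [1]): quotient 2x + 1, remainder 1
```

The inverse `lead_inv` is exactly where the field hypothesis is used; over a ring such as $\mathbb{Z}_6$ the call to `pow` can fail because the leading coefficient need not be invertible.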


Exercise 5.5.4 Fill in the details of the preceding proof. 
Exercise 5.5.5 Prove the uniqueness of the polynomials q and r in the division algorithm. 


Corollary 5.5.1 Suppose that $F$ is a field, $f(x) \in F[x]$, and $a \in F$. Then $f(a) = 0 \iff f(x) =
(x - a)q(x)$ for some $q(x) \in F[x]$.


Proof. ⟹ By the division algorithm, $f(x) = (x - a)q(x) + r(x)$, where $\deg r < 1$ or $r = 0$. It
follows that $r$ must be a constant in $F$. Thus $f(a) = (a - a)q(a) + r = r$. If $f(a) = 0$, then $r$ is
the 0 polynomial and $f(x) = (x - a)q(x)$.

⟸ This should be clear. ▲


Corollary 5.5.2 Suppose $f \in F[x]$ and $\deg f = n$. Then $f$ has at most $n$ roots in $F$ counting
multiplicity. This means that we count $a$ not just once but $k \ge 1$ times if $(x - a)^k$ exactly
divides $f(x)$ (meaning that $(x - a)^k$ divides $f(x)$ and $(x - a)^{k+1}$ does not divide $f(x)$).


Proof. By the preceding corollary, $f(a) = 0$ implies $f(x) = (x - a)q(x)$ and $\deg f = n =
1 + \deg q$. Thus $\deg q = n - 1$. So we finish the proof by induction on the degree of $f$. ▲


The following corollary is the polynomial analog of Theorem 5.4.2 on ideals in Z. The 
proof is essentially the same. 


Corollary 5.5.3 Every ideal in F[x] is principal, when F is a field. 


Proof. Let $A$ be an ideal in $F[x]$. If $A = \{0\} = (0)$, we are done. Otherwise, let $f(x)$ be a
nonzero element of $A$ of minimal degree. Then we claim $A = (f)$. To prove this, suppose $h \in A$. The
division algorithm says there exist $q, r \in F[x]$ such that $h = qf + r$, with $\deg r < \deg f$ or $r = 0$.
Thus $r = h - qf \in A$ since $h, f \in A$. This contradicts the minimality of the degree of $f$ unless
$r = 0$. Then $h \in (f)$ and $A = (f)$. ▲


The following corollary is an analog of the result from Section 5.3 that $\mathbb{Z}_n$ is a field if
and only if $n$ is a prime.


Corollary 5.5.4 (Irreducible Polynomials Give Rise to Fields). The following are equivalent
in $F[x]$, when $F$ is a field:


(1) $(f(x))$ is a maximal ideal;
(2) $f(x) \in F[x]$ is irreducible;
(3) $F[x]/(f(x))$ is a field.

Proof. We know from Lemma 5.4.2 that (1) ⟺ (3). So we will show (1) ⟺ (2).

(1) ⟹ (2) Assume $(f)$ is a maximal ideal. If $f = g \cdot h$, for $g, h \in F[x]$, such that neither $g$ nor $h$
is a unit, then $(f) \subset (g) \subset (1) = F[x]$ and $(f) \subset (h) \subset (1) = F[x]$ and none of the inclusions
are equality. Why? Recall Exercise 5.4.10 that said $(a) = (b) \iff b = au$, for some unit $u$
in $F[x]$. This contradicts the maximality of $(f)$.

(2) ⟹ (1) Suppose $f$ is irreducible. We know by the preceding corollary that every ideal in
$F[x]$ is principal. So any ideal containing $(f)$ must have the form $(g)$, for some $g \in F[x]$. If
$(f) \subset (g) \subset F[x]$, then $f = g \cdot h$ for some $h \in F[x]$. But the irreducibility of $f$ says that either
$g$ or $h$ is a unit. If $g$ is a unit then $(g) = F[x]$. If $h$ is a unit, then $(f) = (g)$. Thus $(f)$
is maximal. ▲


The following proposition has already been useful in searching for low degree irreducible 
polynomials. 


Proposition 5.5.1 (Irreducibility Test for Low Degree Polynomials). Suppose $F$ is a
field, $f(x) \in F[x]$ and $\deg f = 2$ or 3. Then $f$ is not irreducible iff $\exists\, c \in F$ such that
$f(c) = 0$.


Proof. ⟹ We have a non-trivial factorization of $f$ iff $f = gh$, for $g, h \in F[x]$, where either $g$
or $h$ has degree 1. This is true since $\deg f = \deg g + \deg h$ and $2 = 1 + 1$ or $3 = 2 + 1 = 1 + 2$
are the only possibilities for partitions of 2 or 3 into sums of two positive integers. It follows that
we can take one of the factors, say $g(x)$, to be monic and linear: that is, $g(x) = x - c$ for
some $c \in F$. But then $f(x) = (x - c)h(x)$ implies $f(c) = 0$.

⟸ If $f(c) = 0$, then $x - c$ divides $f(x)$, by Corollary 5.5.1. ▲


Example. The preceding test fails for $\mathbb{Z}_6[x]$ since, for example, $f(x) = (2x + 1)^2$ has no roots
in $\mathbb{Z}_6$. Why is this not a contradiction to the proposition? ▲


Exercise 5.5.6 In Z3[2] show that the polynomials f(x) =x'? — x and g(x) =x’ — x determine 
the same function mapping Z3 into Z3. 


Exercise 5.5.7 In Z,[x] find the quotient and remainder upon dividing f(x) = 52+ + 3274+ 1 
by g(x) =32" + 2x +1. 


Exercise 5.5.8 Find all degree 3 monic irreducible polynomials in Z3[x]. 


Integral domains with a division algorithm are called Euclidean domains. Thus the ring
of polynomials over a field $F$ is a Euclidean domain. As such it has similar properties to
$\mathbb{Z}$. One defines the greatest common divisor $d = \gcd(f, g)$ for $f, g \in F[x]$, to be the unique
monic polynomial which divides both $f$ and $g$ such that any common divisor $h$ of $f$ and $g$
must divide $d$. Again there is a Euclidean algorithm to compute $d$. One has the analog of
Bézout’s identity (Theorem 1.5.2), which states that $d = uf + vg$ for some $u, v \in F[x]$, and
the Euclidean algorithm can be used to find $u$ and $v$.
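
A sketch of the Euclidean algorithm in $\mathbb{Z}_p[x]$, $p$ prime, is given below. It computes only the monic greatest common divisor, not the Bézout coefficients $u$ and $v$. The names `trim`, `poly_rem`, and `poly_gcd` are illustrative; coefficient lists are lowest degree first, the empty list stands for the zero polynomial, and `pow(a, -1, p)` assumes Python 3.8 or later.

```python
def trim(coeffs):
    """Drop trailing zeros; the empty list represents the zero polynomial."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def poly_rem(f, g, p):
    """Remainder of f upon division by a nonzero g in Z_p[x]."""
    r = trim(c % p for c in f)
    g = trim(c % p for c in g)
    inv = pow(g[-1], -1, p)
    while len(r) >= len(g):
        coef = (r[-1] * inv) % p
        shift = len(r) - len(g)
        for i, c in enumerate(g):
            r[shift + i] = (r[shift + i] - coef * c) % p
        r = trim(r)  # the leading term has been cancelled
    return r


def poly_gcd(f, g, p):
    """Monic gcd of f and g in Z_p[x] via repeated division."""
    a = trim(c % p for c in f)
    b = trim(c % p for c in g)
    while b:
        a, b = b, poly_rem(a, b, p)
    if not a:
        return []
    inv = pow(a[-1], -1, p)  # scale the last nonzero remainder to be monic
    return [(c * inv) % p for c in a]


print(poly_gcd([1, 0, 0, 1], [1, 0, 1], 2))  # gcd(x^3 + 1, x^2 + 1) = x + 1 in Z_2[x]
```

The final scaling step is the one change from the integer algorithm: the last nonzero remainder is multiplied by the reciprocal of its leading coefficient.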


Exercise 5.5.9 Prove the preceding statements about $\gcd(f, g)$ for $f, g \in F[x]$ by imitating the
proofs that worked in $\mathbb{Z}$. Here we must make a slight change in the algorithm to ensure that
the final result is a monic polynomial. That is, we must multiply by the reciprocal of the
lead coefficient of the last nonzero remainder to make it monic.


Exercise 5.5.10 If $F$ is a field, show that the ideal $(a(x), b(x))$ in $F[x]$ is $(\gcd(a(x), b(x)))$.


Exercise 5.5.11 


(a) Show that if an ideal A of ring R contains an element of the unit group R*, then A= R. 
(b) Show that the only ideals of a field F are {0} and F itself. 


Exercise 5.5.12 Find gcd (x? + 2x7 + 2x4 1,27 +227 + x+ 2) in Zy [a]. 


5.6 Quotients of Polynomial Rings 


Now we have enough information to use quotients of polynomial rings in order to obtain 
finite fields. We already considered a field of order 9 in Section 5.3. 


Example. A field with eight elements is $\mathbb{F}_8 = \mathbb{Z}_2[x]/(x^3 + x + 1)$.

To see this, you just have to recall a few of the facts that we proved in preceding
sections. First, we know from Section 5.5 that $x^3 + x + 1$ is irreducible in $\mathbb{Z}_2[x]$. Second,
Corollary 5.5.4 says that $\mathbb{Z}_2[x]/(x^3 + x + 1)$ is a field.

How do we know that our field has $2^3$ elements? The answer is that a coset $[f]$ in
$\mathbb{Z}_2[x]/(x^3 + x + 1)$ is represented by the remainder of $f$ upon division by $x^3 + x + 1$. The
remainder has degree $\le 2$ and thus has the form $ax^2 + bx + c$, where $a, b, c \in \mathbb{Z}_2$. Moreover
two polynomials $g, h$ of degree $\le 2$ cannot be congruent modulo $x^3 + x + 1$ unless they are
equal. Why? Congruent means the difference $g - h$ is a multiple of $x^3 + x + 1$. The only
way a polynomial of degree $\le 2$, such as $g - h$, can be a multiple of a degree 3 polynomial
is if $g - h$ is really the 0 polynomial. Thus $g$ must equal $h$.

The preceding is analogous to what happens in $\mathbb{Z}_{163}$. The elements $[m]$ in $\mathbb{Z}_{163}$ are
represented by $[r]$, where $r$ is the remainder of $m$ upon division by 163.

We can set $\theta = [x]$ in $\mathbb{Z}_2[x]/(x^3 + x + 1)$. This means $\theta$ is a root of $x^3 + x + 1 = 0$. We
are saying that $\mathbb{F}_8 = \{a\theta^2 + b\theta + c \mid a, b, c \in \mathbb{Z}_2\}$. Moreover, we can view $\mathbb{F}_8$ as a vector
space over the field $\mathbb{Z}_2$. A basis for this vector space is $\{1, \theta, \theta^2\}$. See Section 7.1 for more
information on vector spaces over fields.

If we express the elements of $\mathbb{F}_8$ in the form $a\theta^2 + b\theta + c$, for $a, b, c \in \mathbb{Z}_2$, it is easy to
add the elements but hard to multiply. Thus it is useful to show that the multiplicative
group of $\mathbb{F}_8$ is cyclic, a fact that can be proved for any finite field. See Exercises 6.3.10
and 7.5.9. It turns out that $\theta$ is a generator in this case. It is not true in general that if our
finite field is $K = \mathbb{Z}_p[x]/(k(x))$, where $p$ is prime and $k(x)$ irreducible, then a root of $k(x)$ generates $K^*$.
Only certain primitive polynomials $k(x)$ over the base field $\mathbb{Z}_p$ will have this property that
a root of $k(x)$ is a generator of $(\mathbb{Z}_p[x]/(k(x)))^*$.

Next we create a table of powers of $\theta = [x]$ in $\mathbb{Z}_2[x]/(x^3 + x + 1)$. We know that
$\theta^3 + \theta + 1 = 0$. This implies that $\theta^3 = -\theta - 1 = \theta + 1$ since $-1 = +1$ in $\mathbb{Z}_2$. Then we note that

$$\theta(a_0 + a_1\theta + a_2\theta^2) = a_0\theta + a_1\theta^2 + a_2\theta^3 = a_0\theta + a_1\theta^2 + a_2(\theta + 1) = a_2 + (a_0 + a_2)\theta + a_1\theta^2.$$

So multiplication by $\theta$ sends the coefficients $(a_0, a_1, a_2)$ to $(a_2, a_0 + a_2, a_1)$. This is what
is called a “feedback shift register.” So now it is easy to make a table of powers of $\theta$
(see Table 5.1). The $j$th row will list the coefficients $(a_0, a_1, a_2)$ of $\theta^j = a_0 + a_1\theta + a_2\theta^2$. To
go from the $j$th row to the $(j + 1)$st row, send $(a_0, a_1, a_2)$ to $(a_2, a_0 + a_2, a_1)$.


Table 5.1 Powers of θ for F_8: θ^j = a_0 + a_1 θ + a_2 θ^2

    θ^j        a_0   a_1   a_2
    θ           0     1     0
    θ^2         0     0     1
    θ^3         1     1     0
    θ^4         0     1     1
    θ^5         1     1     1
    θ^6         1     0     1
    θ^7 = 1     1     0     0


This shows that the multiplicative group of units $\mathbb{F}_8^*$ is a cyclic group of order 7 generated
by $\theta$. We call $x^3 + x + 1$ a primitive polynomial in $\mathbb{Z}_2[x]$ for this reason. Figure 5.4 shows a
picture of the feedback shift register corresponding to this polynomial. You can use primitive
polynomials to construct other feedback shift registers. Such a register is a finite state machine that
will cycle through $2^n - 1$ states if $f(x)$ is a primitive polynomial of degree $n$ in $\mathbb{Z}_2[x]$. The states
are really the rows of the table of powers of $\theta$ for $\theta$ a root of $f(x)$, and this is really the
multiplicative group of the finite field $\mathbb{Z}_2[x]/(f(x))$. The successive states of the registers
are given in Table 5.1 for the example under consideration. ▲


Figure 5.4 A feedback shift register diagram corresponding to the finite field
Z_2[x]/(x^3 + x + 1) and the multiplication table given in the text


Feedback shift registers are of interest in generating pseudo-random numbers and in 
cryptography. There are applications in digital broadcasting, communications, and error- 
correcting codes. See the first two sections of Chapter 8. 
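
The shift-register rule described above, which sends (a_0, a_1, a_2) to (a_2, a_0 + a_2, a_1), can be sketched in a few lines of Python. This is only an illustrative sketch for the example Z_2[x]/(x^3 + x + 1); the names lfsr_step and powers_of_theta are our own and do not come from the text.

```python
def lfsr_step(state):
    """Multiply a0 + a1*theta + a2*theta^2 by theta in Z_2[x]/(x^3 + x + 1).

    Uses theta^3 = theta + 1, so (a0, a1, a2) -> (a2, a0 + a2, a1) mod 2.
    """
    a0, a1, a2 = state
    return (a2, (a0 + a2) % 2, a1)


def powers_of_theta():
    """Return [(j, coefficients of theta^j)] for j = 1, 2, ... until theta^j = 1."""
    state = (0, 1, 0)  # theta itself
    rows = [(1, state)]
    while state != (1, 0, 0):
        state = lfsr_step(state)
        rows.append((len(rows) + 1, state))
    return rows


table = powers_of_theta()
for j, coeffs in table:
    print(j, coeffs)
```

The loop stops when the register returns to the state (1, 0, 0), that is, when the power of θ equals 1. For this polynomial the printed rows should reproduce Table 5.1.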


Exercise 5.6.1 


(a) Show that x^2 − 2 is an irreducible polynomial in Z_5[x].

(b) Show that the factor ring Z_5[x]/(x^2 − 2) is a field with 25 elements.

(c) Show that the field in part (b) can be identified with Z_5[√2] =
{a + b√2 | a, b ∈ Z_5}, the smallest field containing Z_5 and √2. Note that this is legal
because the equation a^2 = 2 has no solution a ∈ Z_5. Here we add and multiply as we
would in R, except that everything is mod 5.

(d) What is the characteristic of the field with 25 elements in parts (b) and (c)? 


Exercise 5.6.2 Identify F_25 = Z_5[x]/(x^2 − 2) as the set Z_5[√2] as in the preceding exercise.
Set θ = √2. Find the table of powers θ^j = a_0 + a_1θ, where θ^2 = 2, in a similar manner to
the table we created for Z_2[x]/(x^3 + x + 1). Do these powers give the whole unit group in
Z_5[x]/(x^2 − 2)? This would say that the polynomial x^2 − 2 is primitive in Z_5[x]. What is
the feedback shift register diagram as in Figure 5.4 that corresponds to this polynomial?
How many states does it cycle through before it repeats?


Hint. The order of the unit group F_25^* is 24 while the order of 2 is 4 and thus the order of
√2 is 8. The symbol for multiplication by c in a feedback shift register is a circle with c
inside it on the arrow going into the register.


Exercise 5.6.3 


(a) Show that the ideal (x) in Z_3[x] is maximal.
(b) Prove that Z_3 is isomorphic to Z_3[x]/(x).


Exercise 5.6.4 


(a) Show that Z_3[x]/(x^2 + x + 2) is a field F_9 with nine elements which can be viewed as
the field Z_3[θ] = {a + bθ | a, b ∈ Z_3}, where θ^2 + θ + 2 = 0.

(b) Imitating Table 5.1 in the example above, compute the powers of θ from part (a).

(c) Is the multiplicative group F_9^* in the field of part (a) cyclic?

(d) Draw the corresponding feedback shift register diagram as in Figure 5.4. 


Exercise 5.6.5 Find an infinite set of polynomials f(x) ∈ Z_3[x] such that f(a) = 0 for all
a ∈ Z_3.


Exercise 5.6.6 Suppose a is a nonzero element of a finite field F with n elements. Show that
a^(n−1) = 1.


Exercise 5.6.7 


(a) Consider Z_5[i] = {a + bi | a, b ∈ Z_5}, where we view i as some entity not contained in
Z_5, such that i^2 = −1. Show that this ring is not a field.

(b) Consider Z_7[i] = {a + bi | a, b ∈ Z_7}, where we view i as some entity not contained in
Z_7 such that i^2 = −1. Show that this ring is a field.

(c) Can you develop a more general version of this problem for Z_p[i] where p is an odd
prime according to whether p is congruent to 1 or 3 (mod 4)? Here you will need to
use a fact from number theory. The congruence c^2 ≡ −1 (mod p) has a solution c iff
p ≡ 1 (mod 4).


Hint for (c). The fact that the group Z_p^* is cyclic is useful for the proof of that number theory
fact. We prove this later (in Exercise 6.3.10 as well as Theorem 7.5.4). Once we know Z_p^*
is cyclic, then the main theorem on cyclic groups from Section 2.5 tells us that Z_p^* can only
have a subgroup of order 4 like the subgroup generated by i if 4 divides p − 1.


Exercise 5.6.8 We use the notation R[a] for ring R and element a of some larger ring S to
mean that R[a] is the smallest subring of S containing R and a - also known as the ring
generated by a over R. Show that R[a] consists of all polynomials f(a), with f(x) ∈ R[x].
Explain why in the preceding exercise you only need polynomials of degree ≤ 1.


Exercise 5.6.9 Consider the quotient Z,[x]/(x* + 1). Is it a field? Why?


Exercise 5.6.10 Consider the quotient Z3[x]/(x* + 1). Is it a field? Why?


Exercise 5.6.11 Describe the following rings. Are they fields? 


(a) Zz |x] / (2); 
(b) Z, [x] /(x+ 1); 
(c) Zz |x] / (x*). 


Exercise 5.6.12 Construct a field with 49 elements.


Rings: There’s More 


6.1 Ring Homomorphisms 


We have already discussed ring isomorphisms. Now you must agree that we need to consider 
the ring analog of group homomorphism from Section 3.5. 


Definition 6.1.1 Suppose that R and S are rings. Then a function T: R → S is a ring
homomorphism iff T preserves both ring operations: that is,


T(a + b) = T(a) + T(b) and T(ab) = T(a)T(b), for all a, b ∈ R.


If, in addition, T is 1-1 and onto, we say that T is a ring isomorphism and write R ≅ S.
If the ring homomorphism T: R → R is 1-1 and onto, it is called a ring automorphism.


As we said earlier, we will usually identify isomorphic rings, just as we can identify 
isomorphic groups. 


Example. π: Z → Z_n defined by π(a) = a (mod n) is a ring homomorphism and is onto but
not 1-1. This example is easily generalized to π: R → R/A, where A is any ideal in a ring
R and π(a) = a + A = [a]. ▲


Application: Test for Divisibility by 3. Any integer has a decimal expansion which we
write n = a_k a_(k−1) ··· a_1 a_0, for a_j ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, where this means that
n = a_k 10^k + a_(k−1) 10^(k−1) + ··· + a_1 10 + a_0. Then we see that 3 divides n = a_k a_(k−1) ··· a_0 iff
3 divides a_k + a_(k−1) + ··· + a_1 + a_0, the sum of the digits of n.


Proof. Look at the homomorphism π: Z → Z_3 defined by π(a) = a (mod 3). Then, since
π(10) ≡ 1 (mod 3), we have


π(n) = π(a_k 10^k + a_(k−1) 10^(k−1) + ··· + a_1 10 + a_0)
     = π(a_k)π(10)^k + π(a_(k−1))π(10)^(k−1) + ··· + π(a_1)π(10) + π(a_0)
     ≡ a_k + a_(k−1) + ··· + a_1 + a_0 (mod 3). ▲


Example. Does 3 divide 314 159 265 358 979 323 846? We compute the sum of the digits to 
be 103 and then the sum of those digits is 4, which is not divisible by 3. So the answer 
is “No.” ▲
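
The digit-sum test is easy to try out on a computer. The following is a minimal Python sketch under our own naming (digit_sum and divisible_by_3 are illustrative names, not from the text); it repeats the digit sum until a single digit remains, as in the example above.

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer n."""
    return sum(int(d) for d in str(n))


def divisible_by_3(n):
    """Digit-sum test: replace n by its digit sum until one digit remains."""
    while n >= 10:
        n = digit_sum(n)
    return n in (0, 3, 6, 9)


n = 314159265358979323846
print(digit_sum(n))
print(divisible_by_3(n))
```

One can compare divisible_by_3(n) with the direct computation n % 3 == 0 to confirm that the two always agree.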


Exercise 6.1.1 Does 3 divide 271 828 182 845 904 523 536?


Exercise 6.1.2 Create a similar test to see whether 11 divides a number. Then use your
test to see whether 11 divides the numbers in the preceding example and exercise.


The following theorem gives the basic facts about ring homomorphisms - facts analogous 
to those for group homomorphisms in Section 3.5. In particular, we define the kernel of a 
ring homomorphism and find an analog of the first isomorphism theorem. 


Theorem 6.1.1 (Properties of Ring Homomorphisms). Suppose that S and S' are rings
and T: S → S' is a ring homomorphism. Let 0 be the identity for addition in S and 1
the identity for multiplication in S, if S has an identity for multiplication. Let 0' be the
identity for addition in S'. Then we have the following facts.


(1) (a) T(0) = 0'.
    (b) The image T(S) is a subring of S'. Here T(S) = {T(s) | s ∈ S}. If S has an identity
    for multiplication and T(1) ≠ T(0), then T(1) is the identity for multiplication
    in the image ring T(S).
(2) If S is a field, then T(1) ≠ T(0) implies that the image T(S) is a field.
(3) Define the kernel of T to be ker T = T^(−1)(0') = {x ∈ S | T(x) = 0'}. Then ker T is an
ideal in S. Moreover T is 1-1 iff ker T = {0}.
(4) (First Isomorphism Theorem) S/ker T ≅ T(S).


Proof. 


(1a) This follows from the corresponding fact about groups in Section 3.5, since S and S' are
groups under addition.

(1b) The image T(S) is an additive subgroup of S' from results of Section 3.5. To finish the
proof of this part, we must think a little since S, S' are unlikely to be groups under
multiplication. To see that T(S) is closed under multiplication, note that T(a)T(b) =
T(ab) ∈ T(S), for all a, b ∈ S. As a subset of S', the image T(S) is a ring, by the subring
test. Then, if S has an identity for multiplication and T(1) ≠ T(0), for a ∈ S, we have


T(1)T(a) = T(1 · a) = T(a),
T(a)T(1) = T(a · 1) = T(a).


It follows that T(1) is the identity for multiplication in T(S).

(2) If S is a field, from part (1) we know that T(1) is the identity for multiplication in
T(S). Since S^* = S − {0} is a group under multiplication, we can use results from
Section 3.5 to see that T(S^*) = T(S)^* is a group under multiplication.

(3) We know that ker T is an additive subgroup of S by results of Section 3.5. To show
ker T is an ideal we also need to show that if a ∈ ker T and s ∈ S, then sa and as are in
ker T. This is easy since T(sa) = T(s)T(a) = T(s)0' = 0' implies sa ∈ ker T. The same sort
of argument works for as.

(4) We imitate the proof of the first isomorphism theorem for groups in Section 3.5. As
before, we define a map T̃: S/ker T → T(S) by setting T̃([a]) = T̃(a + ker T) = T(a).
Then we need to show that T̃ is a ring isomorphism.

T̃ is well defined since if [a] = a + ker T = [b], then b = a + u, where u ∈ ker T. This
implies

T̃([b]) = T(b) = T(a + u) = T(a) + T(u) = T(a) + 0' = T(a) = T̃([a]).

T̃ is 1-1 since T̃([a]) = T̃([b]) implies T(a) = T(b) and thus a − b ∈ ker T and [a] = [b].

T̃ is onto since any element of T(S) has the form T(a) for some a ∈ S. Thus
T̃([a]) = T(a).

T̃ preserves both ring operations since, for any a, b ∈ S, we have the following, using
the definition of addition and multiplication in the quotient S/ker T, the definition
of T̃, and the fact that T is a ring homomorphism:


T̃([a] + [b]) = T̃([a + b]) = T(a + b) = T(a) + T(b) = T̃([a]) + T̃([b]),


T̃([a] · [b]) = T̃([a · b]) = T(a · b) = T(a) · T(b) = T̃([a]) · T̃([b]). ▲


Example. Define Z_3[i] = {x + iy | x, y ∈ Z_3}, where i^2 = −1. Here, for x, y, u, v ∈ Z_3, we
define (x + iy) + (u + iv) = x + u + i(y + v) and (x + iy) · (u + iv) = xu − yv + i(xv + yu).
We can use the first isomorphism theorem to show that


Z_3[i] ≅ Z_3[x]/(x^2 + 1).


Note that the right-hand side is a field because x^2 + 1 is irreducible in Z_3[x] since the fact
that −1 is not a square mod 3 means x^2 + 1 has no roots in Z_3. Here we use Corollary 5.5.4.
The left-hand side can be shown directly to be a field by proving that it is possible to find
the multiplicative inverse of any nonzero element.

First we define a ring homomorphism T: Z_3[x] → Z_3[i] by T(f(x)) = f(i) for any
polynomial f(x) ∈ Z_3[x]. The map T is well defined, preserves the ring operations, and is onto.
For example, it is easily seen that


T(f + g) = (f + g)(i) = f(i) + g(i) = T(f) + T(g)


and


T(f · g) = (f · g)(i) = f(i) · g(i) = T(f) · T(g).


We claim ker T = (x^2 + 1). To see this, note first that (x^2 + 1) ⊂ ker T, since i^2 + 1 = 0.
To prove ker T ⊂ (x^2 + 1), let g(x) ∈ ker T. By the division algorithm, we have g(x) =
(x^2 + 1)q(x) + r(x), where deg r < 2 or r = 0. Since g(i) = 0, we see that r(i) = 0. But if r ≠ 0,
deg r = 0 or 1, and we have r(x) = ax + b, with a, b ∈ Z_3. But then ai + b = 0. If a ≠ 0, this
would mean i = −b/a ∈ Z_3, a contradiction to the fact that −1 is not a square in Z_3. If a = 0,
then b = 0 as well. Thus r must be the 0-polynomial and ker T ⊂ (x^2 + 1) to complete the
proof that ker T = (x^2 + 1).

It follows then from the first ring isomorphism theorem (which was part (4) of
Theorem 6.1.1) that Z_3[i] ≅ Z_3[x]/(x^2 + 1). ▲
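
The remark that Z_3[i] can be shown directly to be a field, by finding the inverse of each nonzero element, can be checked by brute force. Here is a small Python sketch, assuming we represent x + iy as the pair (x, y); the names P, mul, and inverses are our own illustrative choices.

```python
P = 3  # the prime modulus


def mul(z, w):
    """(x + iy)(u + iv) = (xu - yv) + i(xv + yu), with coefficients mod P."""
    x, y = z
    u, v = w
    return ((x * u - y * v) % P, (x * v + y * u) % P)


elements = [(x, y) for x in range(P) for y in range(P)]
nonzero = [z for z in elements if z != (0, 0)]

# For each nonzero z, search for w with z * w = 1.
inverses = {z: next(w for w in nonzero if mul(z, w) == (1, 0)) for z in nonzero}

for z, w in inverses.items():
    print(z, "has inverse", w)
```

The search with next succeeds for every nonzero element here precisely because Z_3[i] is a field. For a modulus where −1 is a square, the same search would fail for some element.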


Exercise 6.1.3 Suppose that R,S are rings, A is a subring of R. Let T:R—S be a ring 
homomorphism. 


(a) Show that for all r ∈ R and all n ∈ Z^+, we have T(nr) = nT(r) and T(r^n) = T(r)^n. Here
nr = r + r + ··· + r (n summands).


(b) Show that if A is an ideal of R and T(R) = S, then T(A) is an ideal of S. 


Exercise 6.1.4 Under the same hypotheses as in the preceding exercise, prove that if B is
an ideal in S, then we have the following facts.


(a) T^(−1)(B) is an ideal of R. Here the inverse image of B under T is
T^(−1)(B) = {a ∈ R | T(a) ∈ B}.
Do not assume the inverse function of T exists.


(b) If T is a ring isomorphism of R onto S, then the inverse function T^(−1) is a ring
isomorphism of S onto R.


Exercise 6.1.5 Show that isomorphism of rings has the properties of an equivalence relation. 


Exercise 6.1.6 

(a) If we try to define T: Z; > Zo by setting T(x) =5x we do not really have a well-defined 
function. Explain. 

(b) Show that T:Z,—-Z 2 given by T(x)=3x is well defined but does not preserve 
multiplication. 

(c) Show that every homomorphism T: Z_n → Z_n has the form T(x) = ax for some fixed a in
Z_n with a^2 ≡ a (mod n). Find a value of a that works when n = 12.


Exercise 6.1.7 

(a) Show that the factor ring R[x]/(x^2 + 1) is isomorphic to the ring of complex numbers
C. Here R[x] is the ring of polynomials in one indeterminate x and real coefficients.

(b) Show that complex conjugation, which sends a + ib to a − ib, for a, b in R and i^2 = −1,
defines a ring automorphism of C.

(c) Show that C is not isomorphic to R. 

(d) Show that C is isomorphic to the ring of 2 × 2 real matrices with rows (a, b) and
(−b, a), for a, b ∈ R, where the operations are the usual matrix addition and multiplication.


Exercise 6.1.8 

(a) Show that the only ring homomorphisms T: Q → Q are the zero map T(x) = 0, ∀x ∈ Q,
and the identity I(x) = x, ∀x ∈ Q. Hint. First look at the map on Z.

(b) Show that the only ring automorphism of R is the identity map. Hint. First, recall that
the positive reals are squares of nonzero reals and vice versa. Then recall that a < b
⟺ b − a > 0. Use this to show that a < b implies φ(a) < φ(b). Then suppose that ∃ a
such that φ(a) ≠ a. Consider the two cases that a < φ(a) and φ(a) < a. There is a rational
number between a and φ(a). Use the fact that φ must be the identity on the rationals to
get a contradiction.


Finally we have an exercise which is the finite characteristic analog of Exercise 5.3.12. 


Exercise 6.1.9 Suppose that F is a field of prime characteristic p. Show that F contains a
subfield isomorphic to Z_p.


Hint. Define a mapping T: Z → F by setting T(n) = 1 + 1 + ··· + 1 (n summands), for
n ∈ Z^+. Set T(0) = 0 and T(−n) = −T(n), for n ∈ Z^+. Show that T preserves the
operations of addition and multiplication and induces a field isomorphism from Z_p onto
a subfield of F.


Exercise 6.1.10 (Second Isomorphism Theorem for Rings). Suppose that R is a ring, A is a
subring of R, and B is an ideal of R. Show that


A + B = {a + b | a ∈ A, b ∈ B}

is a subring of R and A ∩ B is an ideal of A. Then show that


(A + B)/B ≅ A/(A ∩ B).

There are also third and fourth isomorphism theorems. See Dummit and Foote [28, p. 246].


Exercise 6.1.11 Find two ring automorphisms of Z_3[i] = {x + iy | x, y ∈ Z_3}, where i^2 = −1,
that fix elements of Z_3.


Exercise 6.1.12 Show that the only ring automorphism of Z is the identity map I(x) = x,
for all x ∈ Z.


Exercise 6.1.13 Consider the ring Z_p^(2×2) of 2 × 2 matrices with entries in Z_p for prime p. The
ring operations are componentwise addition and matrix multiplication as in Exercise 5.2.1.
Define T: Z_p^(2×2) → Z_p by sending the matrix with rows (a, b) and (c, d) to a + d. Is this a
ring homomorphism? Why?


Exercise 6.1.14 We can define a mapping T: Z_24 → Z_12 by T(x (mod 24)) = x (mod 12). Show
that this map is a ring homomorphism. Find the kernel. Is it onto? What does the first ring
isomorphism theorem say?


6.2 The Chinese Remainder Theorem 


An example of the Chinese remainder theorem can be found in a manuscript by Sun-Tzu
from the third century AD.


Theorem 6.2.1 (The Chinese Remainder Theorem for Rings). Assume that the positive
integers m, n satisfy gcd(m, n) = 1. The mapping T: Z_mn → Z_m ⊕ Z_n defined by T([s]) =
(s (mod m), s (mod n)) is a ring isomorphism, showing that Z_mn is isomorphic to the
ring Z_m ⊕ Z_n.


Proof. First consider the mapping T: Z → Z_m ⊕ Z_n defined by T(s) = (s (mod m), s (mod n)).
Then T is a ring homomorphism. Note that Exercise 3.6.3 showed that it is an additive
group homomorphism. To see that it preserves multiplication, let a, b ∈ Z. Then, using the
definition of T and the definition of multiplication in Z_m ⊕ Z_n, we have:


T(a · b) = (a · b (mod m), a · b (mod n))
         = (a (mod m), a (mod n)) · (b (mod m), b (mod n)) = T(a) · T(b).

It follows from the first ring isomorphism theorem (which was part (4) of Theorem 6.1.1)
that T̃: Z/ker T → T(Z) defined by T̃(x + ker T) = T(x), for x ∈ Z, is an isomorphism. So we
need to compute the kernel of T. This is, since gcd(m, n) = 1,

ker T = {a ∈ Z | a ≡ 0 (mod m) and a ≡ 0 (mod n)}
      = {a ∈ Z | m divides a and n divides a} = mnZ.

This map T̃ is 1-1. Thus the image of T̃ must have mn elements. It follows that T̃ must map
Z_mn onto Z_m ⊕ Z_n because Z_m ⊕ Z_n has mn elements. ▲


In particular, the Chinese remainder theorem says that if gcd(m, n) = 1, there is a solution
x ∈ Z to the simultaneous congruences:


x ≡ a (mod m),
x ≡ b (mod n).


When the theorem is discussed in elementary number theory books, only the onto-ness 
of the function is emphasized. Many examples like the following are given. This result 
and its generalizations have many applications: for example, to rapid and high-precision 
computer arithmetic. See the next section or Dornhoff and Hohn [25], Knuth [54], Richards 
[90], Rosen [91], or Terras [116]. 


Example. Solve the following simultaneous linear congruences for x:


3x ≡ 1 (mod 5),
2x ≡ 3 (mod 7).


The first congruence has the solution x ≡ 2 (mod 5), as one can find by trial and error. Then
plug x = 2 + 5u into the second congruence to get


2x = 2(2 + 5u) ≡ 3 (mod 7) and thus 4 + 10u ≡ 3 (mod 7).


This becomes 3u ≡ 6 (mod 7). One immediately sees a solution u ≡ 2 (mod 7). This means
that u = 2 + 7t. Plug this back into our formula for x and get x = 2 + 5(2 + 7t) = 12 + 35t.
This means x ≡ 12 (mod 35). You should check that x solves the original simultaneous
congruences. ▲
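
The substitution method of this example can be mirrored step by step in Python. This is a sketch only: the helper solve_linear is a hypothetical name of our own, and it finds solutions by trial and error, just as the text does, so it is suitable only for small moduli.

```python
def solve_linear(a, b, m):
    """Smallest x in range(m) with a*x = b (mod m), found by trial; None if no solution."""
    for x in range(m):
        if (a * x - b) % m == 0:
            return x
    return None


# Step 1: 3x = 1 (mod 5) gives x = x0 + 5u.
x0 = solve_linear(3, 1, 5)
# Step 2: substitute into 2x = 3 (mod 7): 2*(x0 + 5u) = 3, i.e. (2*5)u = 3 - 2*x0 (mod 7).
u0 = solve_linear(2 * 5, 3 - 2 * x0, 7)
# Step 3: reassemble x modulo 5*7 = 35.
x = (x0 + 5 * u0) % 35
print(x0, u0, x)
```

The three printed values correspond to the residues x mod 5, u mod 7, and x mod 35 found in the example.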


There are many ways to understand the Chinese remainder theorem. The first step is
to extend it to an arbitrary number of pairwise relatively prime moduli. If the positive
integers m_1, ..., m_r satisfy gcd(m_i, m_j) = 1 for all i ≠ j, and m = m_1m_2···m_r, then the rings
Z_m and Z_(m_1) ⊕ ··· ⊕ Z_(m_r) are isomorphic under the mapping
f(x mod m) = (x mod m_1, ..., x mod m_r). We leave the proof of this as an exercise.


Exercise 6.2.1 Show that if we assume that the positive integers m_1, ..., m_r satisfy
gcd(m_i, m_j) = 1 for all i ≠ j, and m = m_1m_2···m_r, then the rings Z_m and Z_(m_1) ⊕ ··· ⊕ Z_(m_r)
are isomorphic under the mapping f(x mod m) = (x mod m_1, ..., x mod m_r). Recall Exercise 3.1.6.


We want to look at the case r = 2 again. To create the isomorphism between Z_15 and
Z_3 ⊕ Z_5, for example, you can make a big table with its rows and columns indexed by the
positive integers, as in Table 6.1.

Table 6.1 [A 15 × 15 grid. The columns are labeled 1, 2, 3, 4, 5, repeated three times, and
the rows are labeled 1, 2, 3, repeated five times. The integers 1, 2, ..., 15 are placed down
the diagonal; all other cells are blank.]

Next we fill in the blanks in the upper left 3 × 5 part of the table by moving the first
number on the diagonal of the big table which is left out of that upper 3 × 5 part of the
table. That number is 4. Move it up three rows. Similarly we move 5 up three rows. The next
number 6 must be moved up three rows and then moved left five columns. Equivalently
put 6 diagonally down from 5 next to the smaller grid below and then move it left five
columns. This produces the following table.


       1    2    3    4    5
  1    1              4
  2         2              5
  3    6         3


Continue in this way to complete the table which embodies the Chinese remainder theorem
for the case Z_3 ⊕ Z_5.


       1    2    3    4    5
  1    1    7   13    4   10
  2   11    2    8   14    5
  3    6   12    3    9   15
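
The completed table can also be generated mechanically. The sketch below (with crt_table as our own illustrative name) places each s from 1 to mn in the row labeled by its residue mod m and the column labeled by its residue mod n, where, as in the table above, the labels m and n stand for the residue 0. It assumes gcd(m, n) = 1, so that every cell is filled exactly once.

```python
def crt_table(m, n):
    """Rows indexed by residues mod m, columns by residues mod n.

    Entry (r, c) is the unique s in 1..m*n with s = r (mod m) and s = c (mod n),
    assuming gcd(m, n) = 1. Row m and column n play the role of residue 0.
    """
    table = [[None] * n for _ in range(m)]
    for s in range(1, m * n + 1):
        table[(s - 1) % m][(s - 1) % n] = s
    return table


for row in crt_table(3, 5):
    print(row)
```

Walking s = 1, 2, 3, ... through this loop is the same as walking down the diagonal of the big table and wrapping around modulo 3 and modulo 5.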


Since the isomorphism preserves addition and multiplication we can compute mod 15 by
computing mod 3 and mod 5. This is not so impressive but it would work better if we took
a lot of primes like


m = 2^5 · 3^3 · 5^2 · 7^2 · 11 · 13 · 17 · 19 · 23 · 29 · 31 · 37 · 41 · 43 · 47,


which is > 3 · 10^21. Then computing mod m would be the same as computing mod m_i, for
m_1 = 2^5, m_2 = 3^3, ..., m_15 = 47. See Dornhoff and Hohn [25, p. 238] for more information.
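
To see how computing mod m reduces to computing mod each small m_i, here is a short Python sketch. The list MODULI is our reading of the moduli named in the text, and to_residues and multiply_residues are hypothetical helper names; the sketch needs Python 3.8 or later for math.prod.

```python
from math import prod

# Moduli assumed from the text: pairwise relatively prime prime powers.
MODULI = [2**5, 3**3, 5**2, 7**2, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
M = prod(MODULI)


def to_residues(x):
    """Image of x under the map from Z_M to the direct sum of the Z_(m_i)."""
    return [x % m for m in MODULI]


def multiply_residues(r, s):
    """Multiply componentwise, each component modulo its own small modulus."""
    return [(a * b) % m for a, b, m in zip(r, s, MODULI)]


x, y = 123456789012, 987654321098
same = multiply_residues(to_residues(x), to_residues(y)) == to_residues((x * y) % M)
print(M > 3 * 10**21, same)
```

Each component multiplication involves only numbers below 50, while the product x · y itself exceeds m. Recovering the answer mod m from its residues requires the onto-ness of the isomorphism.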

In another visualization, we see the additive group Z_15 as a discrete circle by taking the
Cayley graph X(Z_15, {±1 (mod 15)}). See Figure 6.1. Now we can view the same group as
a two-dimensional product of the circles Z_3 and Z_5. This is a torus or doughnut graph. It is
also the Cayley graph X(Z_15, {5, 6, 9, 10 (mod 15)}) in Figure 6.2.


Figure 6.1 The Cayley graph
X(Z_15, {±1 (mod 15)})


Figure 6.2 The Cayley graph
X(Z_15, {5, 6, 9, 10 (mod 15)})


Exercise 6.2.2 Draw analogous figures to Figures 6.1 and 6.2 for Z35. 


As we noted above, the most important thing for a number theorist is the onto-ness
of the isomorphism T: Z_mn → Z_m ⊕ Z_n for gcd(m, n) = 1, defined by T([s]) =
(s (mod m), s (mod n)). We want to discuss an old method to give an explicit formula for
the solution of the simultaneous congruences that expresses this onto-ness. For example,
suppose that m = 3 and n = 5, and we want to solve


x ≡ a (mod 3),
x ≡ b (mod 5).        (6.1)


Then we first solve two auxiliary congruences:

5u ≡ 1 (mod 3) and 3v ≡ 1 (mod 5).


Then set x = 5au + 3bv (mod 15). It is easily checked that x does solve the problem of
formula (6.1). The method is preferable if you want a formula for the answer.
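
The explicit formula translates directly into code. The following Python sketch (crt_pair is our own name) handles any pair of relatively prime moduli m and n in place of 3 and 5; it relies on the three-argument pow with exponent −1 to compute modular inverses, which is available in Python 3.8 and later.

```python
def crt_pair(a, m, b, n):
    """Solve x = a (mod m), x = b (mod n) for gcd(m, n) = 1.

    Follows the recipe in the text: find u with n*u = 1 (mod m) and
    v with m*v = 1 (mod n), then x = n*a*u + m*b*v (mod m*n).
    """
    u = pow(n, -1, m)  # modular inverse of n mod m
    v = pow(m, -1, n)  # modular inverse of m mod n
    return (n * a * u + m * b * v) % (m * n)


print(crt_pair(1, 3, 3, 5))
```

For a = 1 and b = 3 the result can be compared with the entry in row 1, column 3 of the table for Z_3 ⊕ Z_5 given earlier.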


Exercise 6.2.3 Find an analogous procedure to that of the last paragraph to solve the
simultaneous congruences:


x ≡ a (mod 3),
x ≡ b (mod 5),
x ≡ c (mod 7).


What is the solution when a = 2, b = 4, c = 1?


Hint. Let x = 35ta + 21ub + 15vc, where 35t ≡ 1 (mod 3), 21u ≡ 1 (mod 5), and 15v ≡
1 (mod 7).


Exercise 6.2.4 Use the Chinese remainder theorem to finish the proof of equation (2.4) from 
Section 2.3 for the Euler phi-function. 


There are also applications of the Chinese remainder theorem in RSA cryptography (as we
have seen), secret sharing, the fast Fourier transform, and proving the Gödel incompleteness
theorem in logic.

There are many puzzles related to the Chinese remainder theorem. Some are ancient. 
The following is a puzzle found in Ore [84, pp. 118ff]. 


An old woman goes to market and a horse steps on her basket and crushes the eggs. 
The rider offers to pay for the damage and asks her how many eggs she had brought. 
She does not remember the exact number, but when she had taken them out 2 at a 
time there was 1 egg left. The same happened when she picked them out 3, 4, 5, and 
6 at a time, but when she took them out 7 at a time they came out even. What is the 
smallest number of eggs she could have had? 


Exercise 6.2.5 Solve the preceding puzzle. 


Exercise 6.2.6 Suppose the 2 x 2 integer matrix M has determinant d such that gcd(d,n) = 
1. Show that we can solve the simultaneous congruences represented by the vector 
congruence Mv =a (mod n). 


Hint. Use the inverse matrix (mod n).


Exercise 6.2.7 Generalize the Chinese remainder theorem to a commutative ring R with
identity for multiplication and ideals A and B such that A + B = R, A ≠ {0}, B ≠ {0}. Show
that


R/(A ∩ B) ≅ (R/A) ⊕ (R/B).


Exercise 6.2.8 Show that Z_3[x]/(x^2 − 1) ≅ Z_3 ⊕ Z_3.


Exercise 6.2.9 Suppose F is a field. Consider two rings. The first ring is R = F^(2×2) under
componentwise addition and matrix multiplication. The second ring is S = F^(2×2) under
componentwise addition and componentwise multiplication. Are these two rings isomorphic?
Give a reason for your answer.


Exercise 6.2.10 Nineteen bears have only 14 berry baskets. The first 13 berry baskets have
an equal number of berries and the 14th basket has 3 berries. What is the least number 
of berries present in total if we know that should they be able to put the berries into 19 
baskets, then each basket would have an equal number of berries? 


Exercise 6.2.11 Suppose you have some beads in a jar and you know that when you take 
them out three at a time you have two left, but when you take them out five at a time you 
have four left, and finally when you take them out seven at a time you have six left. How 
many beads are in the jar? 


Exercise 6.2.12 Suppose that F is a field and f(x), g(x) ∈ F[x] with gcd(f(x), g(x)) = 1. Show
that F[x]/((fg)(x)) ≅ (F[x]/(f(x))) ⊕ (F[x]/(g(x))).


Exercise 6.2.13 Show that F[x]/(x^2 − 1) ≅ F ⊕ F, for any field F of characteristic not equal to 2.


6.3 More Stories about F[x] Including Comparisons with Z 


Suppose that F is a field and F[x] is the ring of polynomials in one indeterminate with
coefficients in F. We have made a table of comparisons between the ring of integers Z and
the ring F[x] (see Table 6.2). Most of the facts about F[x] stated in the table were proved in
Section 5.5. We will make the rest exercises.


Exercise 6.3.1 

(a) Compute h(x) =ged(x? + 1, 2% +2? +2? + 1) in Z, [a]. Find polynomials u(x) and v(x) 
in Z,|x] such that h(x) =u(x) (@? + 1) 4+ v(a) (474+ 4+ 2 +1). 

(b) Factor x? + 1 as a product of monic irreducible polynomials in Z7 |x]. 

(c) Write x* + 2° + x’ +1 as a product of monic irreducible polynomials in Z|]. 


Exercise 6.3.2 Prove the analog of Euclid’s Lemma 1.5.1 for F[x], where F is a field. Then
extend the result to the case that a monic irreducible polynomial divides a product of n
polynomials.


Exercise 6.3.3 Assume that F is a field. Prove that every nonzero polynomial f(x) ∈ F[x]
factors as


f(x) = unit · p_1(x) ··· p_r(x),


where a unit is an element of F − {0} and, for each i, p_i(x) is a monic irreducible with
deg p_i ≥ 1. Then prove that this factorization is unique up to order.


Exercise 6.3.4 Suppose F is a field. Show that the ring of polynomials F[x] is isomorphic
to the ring V consisting of infinite sequences {x_n}, n ≥ 0, where x_n ∈ F, for all n and all but a
finite number of x_n are 0. Addition and multiplication in V are given by the formulas:


{x_n} + {y_n} = {x_n + y_n},
{x_n} · {y_n} = {z_n},
with z_n = Σ x_j y_k, the sum being over all j, k with j + k = n and 0 ≤ j, k ≤ n.


Table 6.2 Comparing Z and F[x]


Property                    Z                                   F[x]

infinite ring               yes                                 yes

integral domain             yes                                 yes

unit group                  {1, −1}                             F^* = F − {0}

division algorithm          n = mq + r, 0 ≤ r < m               f(x) = g(x)q(x) + r(x),
                                                                r = 0 or deg r < deg g

divisibility                m | n ⟺ n = mq,                     g(x) | f(x) ⟺ f(x) = g(x)q(x),
                            for some q ∈ Z                      for some q(x) ∈ F[x]

prime                       p > 1 s.t. p = a · b ⟹              f(x) monic irreducible,
                            either a or b is a unit in Z        deg f(x) ≥ 1;
                                                                f = g · h implies either
                                                                g or h is a unit in F[x]

unique factorization        n ≠ 0 ⟹ n = (±1)p_1p_2···p_r,       f ≠ 0 ⟹
into primes                 p_i prime, factorization            f(x) = unit · p_1(x) ··· p_r(x),
                            unique up to order                  p_i(x) monic irreducible,
                                                                deg p_i ≥ 1,
                                                                factorization unique up to order

every ideal A principal     A = (n), n least positive           A = (f), f element of A
                            element of A, if A ≠ {0}            of least degree if A ≠ {0}

maximal ideal               A = (p), p prime                    A = (f(x)), f(x) irreducible,
                                                                deg(f) ≥ 1

R/A = field                 Z/pZ = field when p is prime        F[x]/(f(x)) = field
when A maximal                                                  when f irreducible, deg f ≥ 1

Euclidean algorithm for     yes, Euclidean algorithm works      yes, Euclidean algorithm works
gcd(a, b) = na + mb
(Bézout’s identity)

Euclid’s lemma              p prime, p | ab ⟹ p | a or p | b    f(x) irreducible, f(x) | a(x)b(x)
                                                                ⟹ f(x) | a(x) or f(x) | b(x)


The slightly frightening moral of the preceding exercise is that you do not really need 
the powers of the indeterminate x when computing with polynomials in F[x]. Fraleigh [32, 
pp. 240-241] addresses the matter, saying: “Why carry x around when you do not even 
need it? Mathematicians have simply become used to it, that is why.” He also notes that 
replacing x with the sequence 


{0,1,0,0,0,...} 


would be pretty annoying to many - including x. Then he asks us not to fuss too much
about what x really is. So we call it an “indeterminate.” Moreover we resist calling x a
variable - for as we noted earlier - a polynomial is not a function - particularly over a
finite field.

Next we want to consider our favorite fact about F[x] and its quotients - a fact that
allows us to construct all finite fields. Recall that f irreducible implies deg f > 0.


Proposition 6.3.1 (A Field with p^n Elements, p Prime). If p is prime and f(x) is an
irreducible polynomial in Z_p[x], deg f ≥ 1, then the quotient F_(p^n) = Z_p[x]/(f(x)) is a field
with p^n elements, where n = deg f.


Proof. Since f is irreducible, the ideal (f) is maximal and thus Z_p[x]/(f(x)) is a field by
Corollary 5.5.4. To see that this field has p^n elements, we just need to see that the
elements of Z_p[x]/(f(x)) are represented by the remainders of h ∈ Z_p[x] upon division by
f. These remainders have the form r(x) = a_(n−1)x^(n−1) + ··· + a_1x + a_0, where a_j ∈ Z_p.
Moreover, the cosets in Z_p[x]/(f(x)) of two distinct remainders cannot be the same. Otherwise
f would divide the difference of the remainders. But deg f is greater than the degree of
the difference of two remainders. Contradiction. How many such remainders are there?
There are p possibilities for each a_j. Thus there are p^n remainders and thus p^n elements of
Z_p[x]/(f(x)). ▲


We have already considered the following example in Section 5.3, but it will not hurt 
too much to reconsider it, given our accumulated knowledge. 


Example: A field with nine elements. Z_3[x]/(x^2 + 1) = F_9.

If we view Z_3 as an analog of the real numbers, this field may be viewed as a finite analog
of the complex numbers. From the preceding proposition, we know that Z_3[x]/(x^2 + 1) is
a field with nine elements, once we know that x^2 + 1 is irreducible in Z_3[x]. From
Proposition 5.5.1, we know that x^2 + 1 is irreducible iff it has no roots in Z_3. Consider
f(x) = x^2 + 1 and plug in the elements of Z_3. You get f(0) = 1, f(1) = f(−1) = 2. Thus x^2 + 1 is
irreducible in Z_3[x].

The field Z_3[x]/(x^2 + 1) is isomorphic to Z_3[i] = {a + bi | a, b ∈ Z_3}, which is the smallest
field containing Z_3 and i such that i^2 = −1. Perhaps we should use some other letter than i
here. We are not talking about the complex number i. The letter i just stands for something
not in Z_3 such that i^2 = −1. We showed in the preceding paragraph that no element of Z_3
satisfies the equation for i. To prove that Z_3[x]/(x^2 + 1) ≅ Z_3[i], you can define an onto map
T: Z_3[x] → Z_3[i] by T(f(x)) = f(i). Then ker T = (x^2 + 1) since if g ∈ ker T, we have g(x) =
q(x)(x^2 + 1) + r(x), where deg r(x) < 2 or r = 0. So 0 = g(i) = r(i) means r = 0, and then
g ∈ (x^2 + 1). Conversely any element of (x^2 + 1) must be in ker T. Thus, the first isomorphism
theorem says Z_3[x]/(x^2 + 1) is isomorphic to Z_3[i]. Here we really identify the coset [x] in
Z_3[x]/(x^2 + 1) with i. That is quite analogous to what happened when C was created as
R[i] in Section 5.4. ▲
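
The root test used in this example is easy to automate for small primes. The Python sketch below uses our own names (roots_mod_p, is_irreducible_low_degree) and stores a polynomial as a list of coefficients, lowest degree first. Note the assumption built into the second function: the criterion "irreducible iff no roots" applies only to polynomials of degree 2 or 3.

```python
def roots_mod_p(coeffs, p):
    """Roots in Z_p of the polynomial with coefficients [c0, c1, c2, ...] (lowest degree first)."""
    return [a for a in range(p)
            if sum(c * pow(a, k, p) for k, c in enumerate(coeffs)) % p == 0]


def is_irreducible_low_degree(coeffs, p):
    """Valid only for degree 2 or 3: irreducible over Z_p iff there is no root in Z_p."""
    degree = len(coeffs) - 1
    if degree not in (2, 3) or coeffs[-1] % p == 0:
        raise ValueError("need a polynomial of degree 2 or 3 over Z_p")
    return not roots_mod_p(coeffs, p)


print(roots_mod_p([1, 0, 1], 3))                # roots of x^2 + 1 in Z_3
print(is_irreducible_low_degree([1, 0, 1], 3))
```

The same functions can be applied to x^3 + x + 1 over Z_2, stored as [1, 1, 0, 1], the polynomial used to build the field with eight elements.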


The Quadratic Formula for Z,,, p > 2 


Next we consider Z_p, for p prime, to be an analog of the real numbers R and we ask: is there
an analog of the quadratic formula? So consider the quadratic equation ax^2 + bx + c = 0
for a, b, c ∈ Z_p and a ≠ 0. Now we do the Z_p analog of completing the square, as long as
p ≠ 2. We can divide by a since Z_p is a field and obtain


x^2 + (b/a)x = −c/a.


Then for p ≠ 2 we can add (b/(2a))^2 to both sides and get


x^2 + (b/a)x + (b/(2a))^2 = −c/a + (b/(2a))^2.


Now the left-hand side is a square and we have

(x + b/(2a))^2 = −c/a + (b/(2a))^2 = (−4ac + b^2)/(4a^2).

Define the discriminant D = b^2 − 4ac and obtain

(x + b/(2a))^2 = D/(2a)^2.


So we take square roots of both sides and note that we may have to go to a larger field
than Z_p to find √D. This gives:


x = (−b ± √D)/(2a).

This is the “same” quadratic formula that you may be familiar with from high school.
We have two cases (again assuming p > 2):


(1) If √D ∈ Z_p then x ∈ Z_p.
(2) If √D ∉ Z_p, we can view x as an element of the field


Z_p[√D] ≅ Z_p[x]/(x^2 − D) = F_(p^2),

which is our analog of the complex numbers.


For the real numbers we also had two cases: 


Case 1. D = b^2 − 4ac ≥ 0 ⟹ roots real.
Case 2. D = b^2 − 4ac < 0 ⟹ roots complex and not real.
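
The quadratic formula for Z_p derived above can be tried out with a short Python sketch. The function name quadratic_roots_mod_p is ours, and square roots of D are found by simple trial, which is reasonable only for small p. The modular inverse uses the three-argument pow with exponent −1 (Python 3.8 or later).

```python
def quadratic_roots_mod_p(a, b, c, p):
    """Roots in Z_p of a*x^2 + b*x + c for an odd prime p and a != 0 (mod p).

    Returns a sorted list of roots; an empty list means the discriminant is a
    non-square, so the roots live in the larger field with p^2 elements.
    """
    if p == 2 or a % p == 0:
        raise ValueError("need an odd prime p and a nonzero leading coefficient")
    D = (b * b - 4 * a * c) % p
    # Square roots of D found by trial; fine for small p.
    square_roots = [s for s in range(p) if (s * s - D) % p == 0]
    inv_2a = pow(2 * a, -1, p)  # 1/(2a) in Z_p
    return sorted({((-b + s) * inv_2a) % p for s in square_roots})


print(quadratic_roots_mod_p(1, 3, 1, 11))  # x^2 + 3x + 1 over Z_11
print(quadratic_roots_mod_p(1, 0, 1, 3))   # x^2 + 1 over Z_3
```

The first call illustrates case (1), where D is a square in Z_p. The second illustrates case (2): the empty list reflects the fact that x^2 + 1 has no roots in Z_3.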


Exercise 6.3.5 Explain the preceding cases for the real numbers R and then produce an 
analogous result for the rational numbers Q. 


You may be wondering about the case $p = 2$. When $p = 2$, the quadratic formula does not
make sense, since $2 = 0$ in $\mathbb{Z}_2$ and so $1/2$ makes no sense there.
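
For odd $p$ the quadratic formula can be tried out on a computer. The following Python sketch is one way to do it, under the assumption that we only want roots lying in $\mathbb{Z}_p$ itself; the function names are made up for illustration, square roots of the discriminant are found by trying every element, and `pow(2 * a, -1, p)` (Python 3.8 or later) computes the inverse of $2a$ mod $p$. The example polynomials are not the ones from the exercises.

```python
def square_roots_mod(d, p):
    """All s in Z_p with s^2 = d, found by trying every element."""
    return [s for s in range(p) if (s * s - d) % p == 0]


def quadratic_roots_mod(a, b, c, p):
    """Roots in Z_p of a x^2 + b x + c via x = (-b +- sqrt(D)) / (2a), p an odd prime."""
    if p == 2 or a % p == 0:
        raise ValueError("need an odd prime p and a nonzero leading coefficient")
    disc = (b * b - 4 * a * c) % p
    inv_2a = pow(2 * a, -1, p)  # inverse of 2a in Z_p
    return sorted({((-b + s) * inv_2a) % p for s in square_roots_mod(disc, p)})


def brute_force_roots(a, b, c, p):
    """Roots found by substituting every element of Z_p."""
    return [x for x in range(p) if (a * x * x + b * x + c) % p == 0]


# x^2 + 3x + 2 over Z_7: the discriminant 1 is a square, so both roots lie in Z_7.
print(quadratic_roots_mod(1, 3, 2, 7), brute_force_roots(1, 3, 2, 7))
# x^2 + 1 over Z_3: the discriminant 2 is not a square, so there are no roots in Z_3.
print(quadratic_roots_mod(1, 0, 1, 3), brute_force_roots(1, 0, 1, 3))
```

When the discriminant is not a square, the empty list corresponds to case (2): the roots live in the larger field, not in $\mathbb{Z}_p$.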

There are many other facts that you learned in high school or college that are just as true 
in “any” field. For example, in Section 7.1 we recall a bit of linear algebra, just to make 
sure that you believe it works for any field as well as for the field of real numbers. 


Exercise 6.3.6 

(a) Find all roots of $f(x) = 3x^2 + x + 4$ in $\mathbb{Z}_7$ by the process of substituting all elements
of $\mathbb{Z}_7$.

(b) Find all roots of the polynomial $f(x)$ in part (a) using the quadratic formula for $\mathbb{Z}_7$.
Do your answers agree? Should they?

(c) Same as (a) for $g(x) = x^2 + x + 4$ over $\mathbb{Z}_5$.

(d) Same as (b) for g(x) in part (c). 

(e) Same as (a) for $h(x) = x^2 - 3x + 2$.

(f) Same as (b) for $h(x) = x^2 - 3x + 2$.




Part II Rings 


Exercise 6.3.7 Suppose that $D \in \mathbb{Z}^+$ is not a square: that is, $D \neq n^2$, for any $n \in \mathbb{Z}$. Set
$\mathbb{Q}[\sqrt{D}] = \{x + y\sqrt{D} \mid x, y \in \mathbb{Q}\}$. Show that $\mathbb{Q}[\sqrt{D}]$ is a field.


Exercise 6.3.8 Using the definition in the preceding exercise, show that the mapping
$T: \mathbb{Q}[\sqrt{7}] \to \mathbb{Q}[\sqrt{11}]$ defined by $T(x + y\sqrt{7}) = x + y\sqrt{11}$, for all $x, y \in \mathbb{Q}$, is not a ring
isomorphism.


Exercise 6.3.9 Show that, using the notation of the preceding exercises - with $\mathbb{Q}$ replaced
by $\mathbb{R}$ - we have $\mathbb{R}[\sqrt{-1}] = \mathbb{R}[\sqrt{-7}] = \mathbb{C}$.

The following exercise is an important theorem. Do not skip it! 


Exercise 6.3.10 Show that, for prime $p$, the multiplicative group $\mathbb{Z}_p^*$ is cyclic.


Hint. Recall Exercise 3.6.17 which states that if $G$ is an Abelian group, and $g, h \in G$, there
is an element of $G$ of order the least common multiple $\operatorname{lcm}[|g|, |h|]$. This implies that if $g$
has maximal order $r$ in $G$ then $x^r = e$, the identity, for all $x \in G$. Then use the fact that a
polynomial of degree $d$ over a field $F$ has at most $d$ roots in $F$.


Exercise 6.3.11 (Euler's Criterion). Show that if $p$ is a prime $> 2$, and $a \in \mathbb{Z}_p^*$, then $a = b^2$
for some $b \in \mathbb{Z}_p^*$ iff $a^{(p-1)/2} \equiv 1 \pmod{p}$.
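
Before proving the last two exercises, it may help to see them hold numerically for small primes. The Python sketch below is only a sanity check and not a proof; the function names are illustrative. It lists the generators of $\mathbb{Z}_p^*$ by computing orders directly and compares the two sides of Euler's criterion for every nonzero $a$.

```python
def is_square_mod(a, p):
    """True if a = b^2 for some nonzero b in Z_p (brute force)."""
    return any((b * b - a) % p == 0 for b in range(1, p))


def euler_criterion_holds(p):
    """Check 'a is a square iff a^((p-1)/2) = 1 (mod p)' for every a in Z_p^*."""
    return all(
        is_square_mod(a, p) == (pow(a, (p - 1) // 2, p) == 1)
        for a in range(1, p)
    )


def multiplicative_order(a, p):
    """Smallest k >= 1 with a^k = 1 (mod p), for a not divisible by p."""
    order, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        order += 1
    return order


def generators(p):
    """Elements of Z_p^* of order p - 1, i.e. generators of the cyclic group."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]


print(generators(7))
print(all(euler_criterion_holds(p) for p in (3, 5, 7, 11, 13)))
```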


Exercise 6.3.12 Suppose that $\mathbb{F}_p$ is the finite field with a prime number $p$ of elements.
Suppose that $A$ and $B$ are non-squares in $\mathbb{F}_p$. Show that $\mathbb{F}_p[\sqrt{A}] \cong \mathbb{F}_p[\sqrt{B}]$, again using the
notation of the previous exercises, with $\mathbb{R}$ and $\mathbb{Q}$ replaced with $\mathbb{F}_p$.


Hint. $A = C^2 B$ for some $C \in \mathbb{F}_p$.


Exercise 6.3.13 


(a) Assume $p$ is an odd prime. Show that there are $(p-1)/2$ irreducible polynomials of the form
$f(x) = x^2 - b$ in $\mathbb{Z}_p[x]$.
(b) Show that for every prime $p$, there exists a field with $p^2$ elements.


There is actually a formula for the number of irreducible polynomials of degree $d$ over
$\mathbb{Z}_p$ or any finite field. See Dornhoff and Hohn [25, p. 377].


6.4 Field of Fractions or Quotients 


You might argue that this section could or should have appeared in Chapter 1. The idea of 
the field of fractions (or quotients) generalizes the idea from elementary school that created 
the rational numbers Q from the integers Z. It also generalizes the construction of the field 
of rational functions 


$$F(x) = \left\{ \frac{f(x)}{g(x)} \;\middle|\; f(x), g(x) \in F[x],\ g(x) \neq 0 \right\}$$
from the ring of polynomials $F[x]$ over a field $F$. Recall that we need to identify the fractions:
$\frac{1}{2} = \frac{2}{4} = \frac{3}{6}$ or $\frac{1}{3} = \frac{2}{6}$. You need to recall how to add and multiply them too:
$\frac{1}{2} + \frac{1}{3} = \frac{3+2}{6} = \frac{5}{6}$, $\frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}$. The same sort of things happen with fractions of polynomials. You


need to identify things like

$$\frac{1}{x} = \frac{x+1}{x^2+x}.$$

You also need to add them

$$\frac{1}{x} + \frac{1}{x-1} = \frac{x-1}{x^2-x} + \frac{x}{x^2-x} = \frac{2x-1}{x^2-x}.$$

Once you remember this, you should be able to generalize the idea to any integral domain. 
Thus you would produce the following definition, state the next theorem, and do the ensuing 
exercises. These things are worked out in detail in many references such as [9], sometimes 
just for Z. 


Definition 6.4.1 Suppose that $D$ is an integral domain. Then we can construct the field
of fractions or quotients $F$ by first creating a set $S$ whose elements are the symbols
$\frac{a}{b}$, where $a, b \in D$ and $b \neq 0$. An equivalence relation on $S$ is given by saying $\frac{a}{b} \sim \frac{c}{d}$ iff
$ad = bc$. Then define $F$ to be the set of equivalence classes $\left[\frac{a}{b}\right]$ of $S$. Addition is defined by

$$\left[\frac{a}{b}\right] + \left[\frac{c}{d}\right] = \left[\frac{ad + bc}{bd}\right]$$

and multiplication is defined by

$$\left[\frac{a}{b}\right] \left[\frac{c}{d}\right] = \left[\frac{ac}{bd}\right],$$

for $a, b, c, d \in D$. In these definitions of addition and multiplication, we always assume
$bd \neq 0$.
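
To see the definition at work in the familiar case $D = \mathbb{Z}$, here is a small Python sketch. The class name `Frac` is hypothetical. It deliberately does not reduce fractions to lowest terms: two symbols are declared equal exactly when $ad = bc$, so the sketch tests that the sum and product do not depend on the chosen representatives.

```python
class Frac:
    """The symbol a/b with a, b in Z and b != 0; equality means ad = bc."""

    def __init__(self, a, b):
        if b == 0:
            raise ZeroDivisionError("denominator must be nonzero")
        self.a, self.b = a, b

    def __eq__(self, other):
        # a/b ~ c/d iff ad = bc
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        # [a/b] + [c/d] = [(ad + bc)/(bd)]
        return Frac(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):
        # [a/b][c/d] = [(ac)/(bd)]
        return Frac(self.a * other.a, self.b * other.b)

    def __repr__(self):
        return f"{self.a}/{self.b}"


# Different representatives of the same classes give equivalent answers.
print(Frac(1, 2) + Frac(1, 3), Frac(2, 4) + Frac(3, 9))
print((Frac(1, 2) + Frac(1, 3)) == (Frac(2, 4) + Frac(3, 9)))
print((Frac(1, 2) * Frac(1, 3)) == (Frac(2, 4) * Frac(3, 9)))
```

The same class would work for any integral domain whose elements support `+`, `*`, and `==`, which is the point of the definition.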


Theorem 6.4.1 The preceding definition creates a field F which contains a subring 
isomorphic to D. 


We consign the proof of this theorem to the following exercises. 


Exercise 6.4.1 Prove the following statements. 


(a) The relation ~ in Definition 6.4.1 is indeed an equivalence relation. 
(b) The definitions of addition and multiplication of equivalence classes for the relation in 
part (a) are independent of representative. 


Exercise 6.4.2 Prove the following claims. 


(a) The set F satisfies the field axioms. 
(b) The subring isomorphic to $D$ consists of classes $\left[\frac{a}{1}\right]$, $a \in D$.


The earliest known use of fractions (according to the all-wise internet) goes back to
2800 BC in India (the Indus valley).

Exercise 6.4.3 Is it possible to create a field containing a subring isomorphic to $\mathbb{Z}_6$ or more
generally any ring with zero divisors?


Exercise 6.4.4 Suppose an integral domain D is a subring of the field E. Show that the field 
of fractions of D is isomorphic to a subfield of E. 


Exercise 6.4.5 What is the field of fractions of $\mathbb{Z}_5$?


Exercise 6.4.6 Show that the field of fractions of an integral domain is unique up to 
isomorphism. 


Suppose that, instead of $\mathbb{Z}$, we start with $F[x_1, \dots, x_n]$, where $F$ is a field. Here
$F[x_1, \dots, x_n]$ denotes the ring of polynomials in $n$ indeterminates. This ring was considered
in Exercise 5.2.15. Then the field of fractions for $F[x_1, \dots, x_n]$ is the field of rational
functions in several indeterminates:

$$f(x_1, \dots, x_n) = \frac{g(x_1, \dots, x_n)}{h(x_1, \dots, x_n)},$$

where $g(x_1, \dots, x_n)$ and $h(x_1, \dots, x_n) \in F[x_1, \dots, x_n]$, with $h \neq 0$.

We use the notation $F(x_1, \dots, x_n)$ for the corresponding field of rational functions.

We should perhaps mention a big theorem that intertwines group theory and polynomials.
As for polynomials, elements $\sigma$ of the symmetric group $S_n$ act on rational functions
$f(x_1, \dots, x_n)$ by $(\sigma f)(x_1, \dots, x_n) = f(x_{\sigma(1)}, \dots, x_{\sigma(n)})$. We defined the elementary
symmetric polynomials in Section 3.7:

$$\begin{aligned}
s_1(x_1, \dots, x_n) &= x_1 + x_2 + \cdots + x_n, \\
s_2(x_1, \dots, x_n) &= x_1 x_2 + x_1 x_3 + \cdots + x_{n-1} x_n = \sum_{1 \le i < j \le n} x_i x_j, \\
&\ \ \vdots \\
s_k(x_1, \dots, x_n) &= \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} x_{i_1} x_{i_2} \cdots x_{i_k}, \\
&\ \ \vdots \\
s_n(x_1, \dots, x_n) &= x_1 x_2 \cdots x_n.
\end{aligned} \tag{6.2}$$

The symmetric group acts on the rational functions in $F(x_1, \dots, x_n)$. Then one can show
that symmetric rational functions are rational functions of the elementary symmetric
polynomials. Moreover, one finds that $S_n$ is the Galois group of $F(x_1, \dots, x_n)$ over $F(s_1, \dots, s_n)$.


We will not consider such Galois groups here, but see Herstein [42] or Dummit and Foote
[28]. With a little more effort, one can prove the fundamental theorem on symmetric
polynomials - which says that any symmetric polynomial can be expressed as a polynomial
in the elementary symmetric polynomials. (It is a starred exercise in Herstein [42].) Thus,
for example, when $n = 2$, we have $x_1^2 + x_2^2 = (x_1 + x_2)^2 - 2x_1 x_2 = s_1^2 - 2s_2$.
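
A quick numerical experiment makes the symmetry concrete. In this Python sketch (the function name `elementary_symmetric` is an illustrative choice) the value of $s_k$ is computed as a sum over all $k$-element subsets of the inputs, checked to be unchanged under every permutation, and the $n = 2$ identity above is tested on sample integers.

```python
from itertools import combinations, permutations
from math import prod


def elementary_symmetric(k, xs):
    """s_k(x_1, ..., x_n): sum over all k-element subsets of the product of the chosen x's."""
    return sum(prod(c) for c in combinations(xs, k))


xs = (2, 3, 7)
values = [elementary_symmetric(k, xs) for k in (1, 2, 3)]
print(values)

# Permuting the variables does not change any s_k.
invariant = all(
    [elementary_symmetric(k, perm) for k in (1, 2, 3)] == values
    for perm in permutations(xs)
)
print(invariant)

# The n = 2 identity x1^2 + x2^2 = s1^2 - 2*s2 on sample values.
x1, x2 = 4, 9
s1, s2 = elementary_symmetric(1, (x1, x2)), elementary_symmetric(2, (x1, x2))
print(x1**2 + x2**2 == s1**2 - 2 * s2)
```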


Exercise 6.4.7 Write the following symmetric polynomials in the indeterminates $x_1$ and $x_2$
as polynomials in the elementary symmetric polynomials $s_1 = x_1 + x_2$ and $s_2 = x_1 x_2$.


(a) $x_1^3 + x_2^3$,
(b) $x_1^4 + x_2^4$.


Rings: There’s More 


It is also possible to create rings with smaller sets of denominators than the set of nonzero 
elements. Rings of the sort created in the following exercise have been very useful in number 
theory. See the book by Samuel [97, Chapter V]. 


Exercise 6.4.8 (The Localization of $\mathbb{Z}$ at a Prime $p$). If $p$ is a prime, let $\mathbb{Z}_{(p)}$ denote the subset
of $\mathbb{Q}$ consisting of fractions $\frac{m}{n}$, with $m, n \in \mathbb{Z}$, $\gcd(m, n) = 1$, such that $p$ does not divide $n$.
Show that $\mathbb{Z}_{(p)}$ is a subring of $\mathbb{Q}$. Then show that the nonzero ideals of $\mathbb{Z}_{(p)}$ have the form
$\langle p^n \rangle$, $n = 1, 2, 3, \dots$
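
The membership condition for $\mathbb{Z}_{(p)}$ is easy to experiment with using Python's standard `fractions.Fraction`, which always stores a fraction in lowest terms. The sketch below is a spot check on a finite sample, not a proof of the exercise; the helper name `in_localization` and the sample set are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def in_localization(q, p):
    """True if the fraction q, written in lowest terms, has denominator not divisible by p."""
    return Fraction(q).denominator % p != 0


p = 3
# A finite sample of elements of Z_(3): denominators 1, 2, 4, 5, 7 are prime to 3.
samples = [Fraction(m, n) for m in range(-4, 5) for n in (1, 2, 4, 5, 7)]

# Sums and products of sample elements stay inside Z_(3).
closed = all(
    in_localization(a + b, p) and in_localization(a * b, p)
    for a, b in product(samples, repeat=2)
)
print(closed)
print(in_localization(Fraction(1, 3), p))  # 1/3 is not in Z_(3)
print(in_localization(Fraction(3, 6), p))  # 3/6 = 1/2 is in Z_(3)
```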


There are other sorts of rings of fractions that have been useful in number theory.
Instead of considering fractions such as those in the preceding exercise, one can consider
$\frac{m}{n}$, with $m, n \in \mathbb{Z}$, nonzero $n$, $\gcd(m, n) = 1$, such that $p$ does divide $n$ - denominators in the
complement of the set of denominators in the localization. All you need for the possible
denominators is that they form a closed set under multiplication (and, of course, do not
contain 0). See Ribenboim [89, Chapter 12]. S-integers in rings of algebraic integers like
$\mathbb{Z}[e^{2\pi i/n}]$ appear in recent work on the Stark conjectures. A somewhat expository paper is
that of Stark [111]. The object of such creations as the S-integers seems to be to kill off a
number of prime ideals in order to go from unique factorization into prime ideals in the
ring of integers of an algebraic number field to unique factorization into prime numbers -
simplifying many computations.


Exercise 6.4.9 We defined an ordered integral domain in Section 1.3. Suppose that $F$ is the
field of fractions of an ordered integral domain $D$. Show that if we define $\frac{a}{b}$ positive to mean
that $ab$ is positive in $D$, this turns $F$ into an ordered field as in Definition 5.3.7.


Exercise 6.4.10 Show that in an ordered integral domain $D$ any nonzero square $b^2$ must be
positive. Next suppose we create the field of fractions $F$ of $D$ as in Definition 6.4.1, and we
know that $F$ is an ordered field. Show that then $\frac{a}{b} > 0$ for $a, b \in D$ implies $ab > 0$.


Exercise 6.4.11 Consider the integral domain $\mathbb{Z}[\sqrt{5}] = \{a + b\sqrt{5} \mid a, b \in \mathbb{Z}\}$. What is the
field of fractions for $\mathbb{Z}[\sqrt{5}]$?
7.1 Matrices and Vector Spaces over Arbitrary Fields and Rings like Z


We want to restrict ourselves to a consideration of the mere basics of linear algebra. We 
will make much of this subject an exercise - assuming that you know most of this from 
calculus. You can find solutions in Dornhoff and Hohn [25], for example. Or you could 
look at whatever book you used for this part of your calculus course and ask what remains 
true if we replace $\mathbb{R}$ by $\mathbb{Z}_p$ or some other field $F$. Most of the earlier chapters on Gaussian
elimination, dimension, determinants work as before. So, for example, you might take the 
book [115] by Strang (or the book with the same title by various other authors) and convince 
yourself that all of the results of the early chapters work for arbitrary fields. 

The favorite calculations from linear algebra involve Gaussian elimination. It turns out
that this is not due to Gauss at all but appears in a Chinese book - parts of which were
written as early as 150 BC. Gaussian elimination allows one to put a matrix $A \in F^{m \times n}$ into
echelon form using elementary row operations.

The elementary row operations over the field F are: 


(1) permute row i and row j; 
(2) multiply row i by a nonzero element of F; 
(3) replace row i by row i plus an element of F times row j. 


A matrix in row echelon form means that it has the following properties. 


(1) Rows with at least one nonzero element are above the rows of all 0s. The first nonzero
entry in a nonzero row is called a pivot. Below each pivot is a column of 0s.
(2) Each pivot is to the right of the pivot in the row above. 


Using only elementary row operations over F, you can put any matrix with entries in F 
into row-echelon form. 

You can even put any matrix in $F^{m \times n}$ into row-reduced echelon form by using elementary
row operation (2) to normalize all pivots to be 1 and then by putting 0s above all
pivots.


Example. Suppose the field is $\mathbb{Z}_3$ and the matrix is

$$\begin{pmatrix} 2 & 1 & 0 & 2 \\ 1 & 0 & 0 & 2 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$

Assuming we remember to compute mod 3, we can replace (row 2) by (row 2 $-$ 2 $\cdot$ row 1)
and do the same for (row 3) to get

$$\begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 2 \end{pmatrix}.$$

Finally replace (row 3) by (row 3 $-$ row 2) to get

$$\begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

This matrix is in row-echelon form.
The row-reduced echelon form of the matrix is

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Of course, one also has the analogous elementary column operations. These are really
the elementary row operations applied to the transposed matrix. ▲
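
The three elementary row operations translate directly into code. The following Python sketch computes the row-reduced echelon form of a matrix over $\mathbb{Z}_p$ for a prime $p$; the function name `rref_mod_p` is an illustrative choice, and `pow(x, -1, p)` (Python 3.8 or later) supplies inverses mod $p$. It may choose different pivots than a hand computation would, but the row-reduced echelon form it reaches is the same.

```python
def rref_mod_p(rows, p):
    """Row-reduced echelon form of a matrix over Z_p (p prime)."""
    A = [[x % p for x in row] for row in rows]
    m, n = len(A), len(A[0])
    r = 0  # index of the row that will receive the next pivot
    for c in range(n):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, m) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]          # operation (1): permute rows
        inv = pow(A[r][c], -1, p)
        A[r] = [(inv * x) % p for x in A[r]]     # operation (2): scale the pivot to 1
        for i in range(m):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                # operation (3): subtract a multiple of the pivot row
                A[i] = [(x - f * y) % p for x, y in zip(A[i], A[r])]
        r += 1
        if r == m:
            break
    return A


example = [[2, 1, 0, 2], [1, 0, 0, 2], [1, 0, 0, 0]]
for row in rref_mod_p(example, 3):
    print(row)
```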


Exercise 7.1.1 Over the field $\mathbb{Z}_2$ put the following matrix (which we shall see again in the
section on error-correcting codes)


1101000 
0110100 
0011010 
0001101 


into row-echelon form and row-reduced echelon form. 


Since the elementary row operations on the matrix $A$ of a homogeneous linear equation
$Ax = 0$ do not change the solution set, we get the theorem saying that a homogeneous
system of $m$ linear equations in $n$ unknowns over a field $F$ has a non-trivial solution if $m < n$.
To see this, you just need to see that if $Ax = 0$, and you perform the elementary row
operations on $A$, to produce a new matrix $A'$, you will still find that $A'x = 0$. Moreover, you can
reverse all the row operations to bring $A'$ back to $A$ with the same sort of operations. This
means that $A'x = 0$ implies $Ax = 0$. This same argument is also clear from the elementary
matrix version of the row operations. For the elementary row operations each multiply $A$
on the left by an invertible matrix $U$ with elements in the field $F$ - as is discussed in the
next paragraph.

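We consider an example of the matrix version of the system of equations $Ax = 0$, with
$A \in F^{m \times n}$, $x \in F^n$. Each elementary row operation on $A$ corresponds to finding an $m \times m$
non-singular matrix $U$ with entries in the field $F$ and replacing $A$ by $UA$. Here, as usual,
we define matrix multiplication in the same way as we did over $\mathbb{R}$ with formula (1.3) in
Section 1.8 and the statement following it. Here, as usual, a non-singular matrix is a square
matrix with nonzero determinant - or equivalently a matrix in $F^{m \times m}$ with an inverse for
multiplication in $F^{m \times m}$ - see Exercise 7.3.13. For example, the first operation we did to the
matrix in the example above was to replace (row 2) by (row 2 $-$ 2 $\cdot$ row 1). This is achieved
by multiplying the matrices below - remembering that the coefficients are in $\mathbb{Z}_3$:

$$\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 & 2 \\ 1 & 0 & 0 & 2 \\ 1 & 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$

The second operation to replace (row 3) with (row 3 $-$ 2 $\cdot$ row 1) is achieved by multiplying
the matrices below:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 2 \end{pmatrix}.$$

The third operation to replace (row 3) by (row 3 $-$ row 2) is achieved by multiplying the
following matrices:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

To put 1s on the diagonal we multiply:

$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Then to put zeros in the non-diagonal entry of the second row, we multiply

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Similarly one can produce zeros in the non-diagonal entries of the first row to get to the
row-reduced echelon form matrix

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
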
Exercise 7.1.2 Write down the 3 x 3 matrix that is needed to produce the result of the last 
sentence. 


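Exercise 7.1.3 Write down the product of all the $3 \times 3$ matrices that we used to go from

$$\begin{pmatrix} 2 & 1 & 0 & 2 \\ 1 & 0 & 0 & 2 \\ 1 & 0 & 0 & 0 \end{pmatrix} \quad \text{to} \quad \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$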

Example. Suppose the field is $\mathbb{Z}_3$ and the matrix is that of the previous example

$$A = \begin{pmatrix} 2 & 1 & 0 & 2 \\ 1 & 0 & 0 & 2 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$

Then the corresponding system of equations is $Ax = 0$, ${}^t x = (x_1, x_2, x_3, x_4)$:

$$\begin{aligned} 2x_1 + x_2 + 2x_4 &= 0 \\ x_1 + 2x_4 &= 0 \\ x_1 &= 0. \end{aligned}$$

An equivalent system is that corresponding to the row-reduced echelon form matrix
which is

$$\begin{aligned} x_1 &= 0 \\ x_2 &= 0 \\ x_4 &= 0. \end{aligned}$$

The result is that $x_1 = x_2 = x_4 = 0$ and the other coordinate $x_3$ is arbitrary (i.e., free to be
whatever it wants to be in $\mathbb{Z}_3$). Of course it was pretty obvious at the beginning that the
equations did not involve $x_3$. Moreover it might have been better to interchange row 3 and
row 1 first. Gaussian elimination always involves choices. ▲
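
Over a small finite field one can also simply try every vector. The Python sketch below (the helper name `is_solution` is illustrative) enumerates all $3^4 = 81$ vectors of $\mathbb{Z}_3^4$ and keeps those satisfying $Ax = 0$ for the matrix $A$ with rows $(2,1,0,2)$, $(1,0,0,2)$, $(1,0,0,0)$, which gives an independent check on the elimination.

```python
from itertools import product

p = 3
A = [[2, 1, 0, 2], [1, 0, 0, 2], [1, 0, 0, 0]]


def is_solution(A, x, p):
    """True if every row of A has dot product 0 with x, computed mod p."""
    return all(sum(a * xi for a, xi in zip(row, x)) % p == 0 for row in A)


solutions = [x for x in product(range(p), repeat=4) if is_solution(A, x, p)]
print(solutions)
```

Only the third coordinate varies among the solutions found, matching the conclusion that $x_3$ is the free variable.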


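Exercise 7.1.4 Over the field $\mathbb{Z}_2$ write down the homogeneous system of equations $Gx = 0$
corresponding to the following matrix

$$G = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 1 \end{pmatrix}.$$

From Exercise 7.1.1, the row-reduced echelon form of $G$ is $G' = (I_4 \ A)$, where $I_4$ is the $4 \times 4$
identity matrix. If

$$H = \begin{pmatrix} -A \\ I_3 \end{pmatrix},$$

solve $xH = 0$, for $x \in \mathbb{Z}_2^7$. How does the set of such vectors $x$ compare with the set of vectors
$y = uG$, for $u \in \mathbb{Z}_2^4$?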


The three elementary row operations are obtained by multiplying the matrix on the left
by three types of elementary matrices for which we give $3 \times 3$ examples with entries that
are assumed to be in the field $F = \mathbb{Z}_7$:


1. a permutation matrix $U$, which is a square matrix such that each row and each column
has one entry equal to 1 while the rest of the entries are 0

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \text{ to interchange (row 2) and (row 3);}$$



2. a matrix having all entries 0 except for the diagonal entries, all of which are 1 except
for the $ii$ entry which is some $a \in F^*$

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \text{ to multiply (row 2) by 5 (mod 7);}$$

3. a matrix having all entries 0 except for the diagonal entries, all of which are 1, and the
$ij$ entry which is some $a \in F$

$$\begin{pmatrix} 1 & 0 & 0 \\ 5 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \text{ to replace (row 2) by ((row 2) + 5(row 1)).}$$

The elementary matrices corresponding to the elementary row operations on $F^{n \times n}$ can be
shown to generate the general linear group $GL(n, F)$ consisting of all invertible $n \times n$
matrices with entries in the field $F$.
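
Here is a small Python sketch of the three types of elementary matrices over $\mathbb{Z}_7$, acting by left multiplication. All function names and the sample matrix `A` are hypothetical, chosen only for illustration; rows are indexed from 0 in the code, so "row 2" of the text is index 1.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def swap_matrix(n, i, j):
    """Type 1: identity with rows i and j interchanged."""
    U = identity(n)
    U[i], U[j] = U[j], U[i]
    return U


def scale_matrix(n, i, a):
    """Type 2: identity with the (i, i) entry replaced by a nonzero scalar a."""
    U = identity(n)
    U[i][i] = a
    return U


def add_multiple_matrix(n, i, j, a):
    """Type 3: identity with an extra entry a in position (i, j), i != j."""
    U = identity(n)
    U[i][j] = a
    return U


def matmul_mod(U, A, p):
    """Matrix product U A with entries reduced mod p."""
    return [
        [sum(U[i][k] * A[k][j] for k in range(len(A))) % p for j in range(len(A[0]))]
        for i in range(len(U))
    ]


p = 7
A = [[1, 2], [3, 4], [5, 6]]
print(matmul_mod(swap_matrix(3, 1, 2), A, p))             # rows 2 and 3 interchanged
print(matmul_mod(scale_matrix(3, 1, 5), A, p))            # row 2 multiplied by 5
print(matmul_mod(add_multiple_matrix(3, 1, 0, 5), A, p))  # row 2 + 5 * row 1
# The inverse of "add 5 times row 1" is "add 2 times row 1", since 5 + 2 = 0 in Z_7.
print(matmul_mod(add_multiple_matrix(3, 1, 0, 5), add_multiple_matrix(3, 1, 0, 2), p))
```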


Exercise 7.1.5 Give some examples to show that if you multiply a matrix $A \in F^{3 \times 3}$ on the
left by an elementary matrix $U \in F^{3 \times 3}$, it produces elementary row operations on $A$. Show
also that the elementary matrices in your examples are invertible for matrix multiplication.


Exercise 7.1.6 Show that nx n elementary matrices generate the general linear group 
GL(n, F), for any field F. 


Exercise 7.1.7 Show that, if $U \in GL(n, F)$, the map from $A \in F^{n \times m}$ to $UA$ gives a group
action (as in Definition 3.7.1) of the general linear group $GL(n, F)$ on $F^{n \times m}$, for any field
$F$. Give representatives of the orbits of this group action when $n = 3$ and $m = 2$.


Exercise 7.1.8 Suppose $U \in GL(n, F)$ and $A \in F^{n \times m}$. Why is it that the solutions $x \in F^m$ of
$Ax = 0$ are the same as the solutions to $UAx = 0$?


Question. Does Gaussian elimination work over Euclidean domains $D$ like $\mathbb{Z}$ and $F[x]$? The
answer is that it does work well. In fact, you could even allow $D$ to be a principal ideal
domain - meaning that $D$ is an integral domain such that every ideal in $D$ is a principal
ideal.


The elementary row operations over Euclidean domains D are: 


1. permute row i and row j; 
2. replace row i by a unit in D* times row i; 
3. replace row i by row i plus an element of D times row j. 


Again these elementary row operations on a matrix $A \in D^{n \times m}$ correspond to multiplication
of $A$ on the left by matrices in the general linear group $GL(n, D)$ which consists of $n \times n$
matrices $U$ with elements in $D$ such that the determinant $\det(U)$ is a unit in $D$. Here, as
usual, the group operation is matrix multiplication. For then the formula for the inverse of
a matrix from Exercise 7.3.12 below implies that $U^{-1}$ has entries in $D$. Similarly one can
define column operations over $D$. These will be necessary if one wants to diagonalize the
matrix $A$.
Elementary row and column operations over $\mathbb{Z}$, for example, allow us to put a matrix of
integers into the Smith normal form, meaning a matrix such that all entries are 0 except
those on the diagonal and such that if $d_i$ is the $i$th diagonal entry then $d_i$ divides $d_{i+1}$, for
all $i$, given by

$$\begin{pmatrix} d_1 & & & \\ & \ddots & & \\ & & d_r & \\ & & & 0 \end{pmatrix}, \quad \text{with } d_j \geq 1. \tag{7.1}$$


The prime power divisors of the diagonal entries are called elementary divisors and are 
unique up to multiplication by units in D. This result in turn can be used to prove the funda- 
mental theorem of finitely generated Abelian groups stated below. Here finitely generated 
group G just means that G has a finite generating set. The Smith normal form is also useful 
in computations of algebraic number theory and algebraic topology. 


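Example. Put $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ into Smith normal form over $\mathbb{Z}$.


We replace (row 2) by ((row 2) $-$ 3 $\cdot$ (row 1)). This gives

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \to \begin{pmatrix} 1 & 2 \\ 0 & -2 \end{pmatrix}.$$

Then replace (row 1) by ((row 1) + (row 2)) and (row 2) by $-$(row 2) to get

$$\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$

So the elementary divisors of $A$ are 1 and 2. ▲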


We did not have to use column operations in the preceding example, but it was pretty
simple. In general, you need column operations. Doing this calculation can be quite
difficult - especially if you replace $\mathbb{Z}$ with the ring of polynomials $F[x]$, $F$ a field. Luckily now
software exists to help. I use Scientific Workplace both to type this book and to compute
Smith normal forms for matrices over $\mathbb{Z}$ or $F[x]$. Now I can just put the mouse on the matrix
and then hit tools, matrices, Smith normal form, and the answer pops out:

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \quad \text{Smith normal form: } \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$


What a change from the time I taught a course in which everyone got a different answer 
for the elementary divisors of 3 x 3 matrices over a polynomial ring. 

To put a matrix with integer entries into Smith normal form, your first goal is to put the 
greatest common divisor of the entries in the upper left-hand corner of the new matrix. 
Start by putting the smallest entry in absolute value in the upper left-hand position. Then 
multiply by $-1$ if necessary to make this guy positive. You can slowly achieve your goal
by replacing any other entry in the first row or column by its remainder upon division by 
the first entry. Again take the smallest remainder obtained and put it in the favorite upper 
left position. At some point all the entries in the first row and column except for the first 
must be 0. An induction has begun. 
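
The procedure just described can be turned into a short program. What follows is a minimal Python sketch for integer matrices, not the method used by any particular software package; the function name `smith_normal_form` is an illustrative choice. It uses both row and column operations, repeatedly moves the smallest nonzero entry to the corner and replaces the other entries of its row and column by remainders, and adds a row back in whenever some remaining entry is not divisible by the corner entry.

```python
def smith_normal_form(M):
    """Smith normal form of an integer matrix, by row and column operations over Z."""
    A = [list(row) for row in M]
    m, n = len(A), len(A[0])
    for t in range(min(m, n)):
        while True:
            # Smallest nonzero entry (in absolute value) of the block A[t:, t:].
            candidates = [(abs(A[i][j]), i, j)
                          for i in range(t, m) for j in range(t, n) if A[i][j] != 0]
            if not candidates:
                return A  # the remaining block is zero
            _, i, j = min(candidates)
            A[t], A[i] = A[i], A[t]            # move it to position (t, t)
            for row in A:
                row[t], row[j] = row[j], row[t]
            pivot = A[t][t]
            for i in range(t + 1, m):          # leave remainders in column t
                q = A[i][t] // pivot
                A[i] = [x - q * y for x, y in zip(A[i], A[t])]
            for j in range(t + 1, n):          # leave remainders in row t
                q = A[t][j] // pivot
                for row in A:
                    row[j] -= q * row[t]
            if any(A[i][t] for i in range(t + 1, m)) or any(A[t][j] for j in range(t + 1, n)):
                continue  # a smaller nonzero remainder survives; use it as the next pivot
            bad = [i for i in range(t + 1, m)
                   if any(A[i][j] % pivot for j in range(t + 1, n))]
            if not bad:
                break
            # Some entry is not divisible by the pivot: add its row to row t and repeat.
            A[t] = [x + y for x, y in zip(A[t], A[bad[0]])]
        if A[t][t] < 0:
            A[t] = [-x for x in A[t]]          # multiply by -1 to make the entry positive
    return A


print(smith_normal_form([[1, 2], [3, 4]]))
print(smith_normal_form([[2, 0], [0, 3]]))
```

The second call shows why the divisibility step matters: a diagonal matrix need not already be in Smith normal form, since 2 does not divide 3.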


Theorem 7.1.1 (The Fundamental Theorem of Finitely Generated Abelian Groups). Any
finitely generated Abelian group $G$ is isomorphic to the direct sum $\mathbb{Z}_{d_1} \oplus \cdots \oplus \mathbb{Z}_{d_m} \oplus \mathbb{Z}^n$.
Here $\mathbb{Z}_{d_i}$ and $\mathbb{Z}$ are viewed as additive groups.


This is a central theorem - in both group theory and algebraic number theory, as well 
as algebraic topology. You can find a proof in the spirit of our discussion in Schreier and 
Sperner [101, Chapter IV, Section 20]. More modern books have less computational versions. 
See Dummit and Foote [28] for example. 

Let us now give a sketch of a proof of the fundamental theorem of Abelian groups
following Schreier and Sperner - feeling free to cheat if necessary. The idea is that you
have a presentation of your group $G$ with a finite set of generators. Fix a generating set
$\{g_1, \dots, g_r\}$. Any element $g \in G$ has an expression of the form $g = g_1^{n_1} \cdots g_r^{n_r}$, for some
$n_i \in \mathbb{Z}$. We say the generating set $\{g_1, \dots, g_r\}$ is a basis of $G$ if $g_1^{n_1} \cdots g_r^{n_r} = g_1^{m_1} \cdots g_r^{m_r}$ implies
$g_i^{n_i} = g_i^{m_i}$, for all $i = 1, \dots, r$. For the generating set to be a basis, it suffices to show that the
representation of the identity element is unique.

To show there exists a basis, first define the relation vectors $v = (v_1, \dots, v_r) \in \mathbb{Z}^r$ to be
those corresponding to relations $g_1^{v_1} \cdots g_r^{v_r} = e$, the identity of $G$. Note that the set $R$ of
relation vectors forms a subgroup of the group $\mathbb{Z}^r$ under componentwise addition. In fact,
modern authors would call $R$ a submodule of the $\mathbb{Z}$-module $\mathbb{Z}^r$. This means that $R$ is closed
under scalar multiplication by any integer $n \in \mathbb{Z}$ as well as addition. We will say a bit
about modules after we discuss vector spaces at the end of this section.

Our relation vectors $R$ form a matrix with $r$ columns and infinitely many rows. We need
to go all Smith normal form on this matrix. Certainly it is legal to perform elementary row
operations on the matrix - except for the infinite number of rows part.¹ Well, induction
should work. But we need to be able to do elementary column operations too. How is this
possible? For this, you must be willing to replace or renumber the generators. To interchange
columns $i$ and $j$ you must interchange generators $g_i$ and $g_j$. To multiply column $i$ by $-1$,
you must replace generator $g_i$ by $g_i^{-1}$. To replace (column $j$) by ((column $j$) $-$ $n$(column $i$)),
for $n \in \mathbb{Z}$, you need to replace generator $g_i$ by generator $g_i g_j^n$.

Performing the elementary row and column operations leads to a matrix in Smith normal
form - in which the bottom infinite number of rows are 0 and a diagonal matrix above
most of these 0 rows looks like the matrix (7.1) above. The basic relations are the nonzero
relation vectors.

What does this mean? If $e$ denotes the identity of $G$, the basic relations are $g_1 = e, \dots, g_s = e$,
$g_{s+1}^{d_{s+1}} = e, \dots, g_{s+m}^{d_{s+m}} = e$, with each of these exponents greater than 1. This implies that we can
leave out the first $s$ generators as they are all the identity. Then, renumbering the generators,
we are looking at a group with generators $g_1, \dots, g_{m+n}$, where the first $m$ have finite orders.
The last $n$ generators have infinite order.
The result is that we have the fundamental theorem of Abelian groups.


1 However, that is really a red herring by a theorem about submodules of finitely generated free modules 
being finitely generated. See Hungerford [47, Chapter 2, Section 1]. 
Another application of the Smith normal form replaces the ring $\mathbb{Z}$ of integers with the
polynomial ring $F[x]$, for a field $F$ and indeterminate $x$. If two $n \times n$ matrices $A$, $B$ have
entries in $F$, one can use the Smith normal form of $A - xI$ and $B - xI$ (over the polynomial
ring $F[x]$) to decide whether $A$ and $B$ are similar matrices, meaning that $B = UAU^{-1}$ for
some invertible matrix $U \in F^{n \times n}$. The Smith normal form is due to H. J. S. Smith (1826-1883).
If the Smith normal form of $xI - A$ has diagonal entries $1, \dots, 1, f_j(x), f_{j+1}(x), \dots, f_n(x)$, the
rational canonical form of $A$ will be the matrix with companion matrices of the non-trivial
polynomials $f_i(x)$ along the diagonal. If $F$ is a field and

$$f(x) = x^n - a_{n-1}x^{n-1} - \cdots - a_1 x - a_0$$

is a polynomial in $F[x]$, define the companion matrix

$$C(f) = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & a_0 \\ 1 & 0 & 0 & \cdots & 0 & a_1 \\ 0 & 1 & 0 & \cdots & 0 & a_2 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & a_{n-1} \end{pmatrix}. \tag{7.2}$$


Some authors transpose the companion matrix. 

A favorite reference for me on this subject is the Schaum’s outline by Ayres [4]. As a 
student I found it very frustrating that the texts I read would never tell me how to compute 
the rational canonical form of a given matrix. Instead they would tell me a few tricks that 
worked in special cases along with a general proof that was not computational. Of course 
you could argue that now you do not need to know how Scientific Workplace computes 
the rational canonical form of a matrix. Other references for canonical forms are Dornhoff
and Hohn [25] and Dummit and Foote [28]. 

The Jordan form of a matrix is slightly different. The aim is to produce a matrix as close 
to diagonal as possible. You need to consider a field F such that all elements of F[x] factor 
completely into linear factors - or to assume that all the eigenvalues of the matrix under 
consideration lie in F. Then you find that the Jordan form of matrix M with entries in F is 
a matrix that is block diagonal with Jordan blocks along the diagonal. A Jordan block is a 
matrix of the form 


$$\begin{pmatrix} \lambda & 1 & & 0 \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda \end{pmatrix}.$$

Again the matrix $M$ is similar to its Jordan form matrix. For the details, see Dummit and
Foote [28].
It will be useful to try an example. 


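Example. Find the Smith normal form of the following matrix over $\mathbb{Q}[x]$:

$$xI - A = \begin{pmatrix} x-1 & -2 \\ -3 & x-4 \end{pmatrix}.$$

Since the entry $-2$ is a unit in $\mathbb{Q}[x]$, the greatest common divisor of the entries is 1, and the
determinant is $x^2 - 5x - 2$, so the Smith normal form has diagonal entries $1$ and $x^2 - 5x - 2$.
It follows that the matrix $A$ is similar to the companion matrix of $f(x) = x^2 - 5x - 2$, which
is the matrix

$$C(f) = \begin{pmatrix} 0 & 2 \\ 1 & 5 \end{pmatrix}.$$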


This happens because the Smith normal form of $xI - C(f)$ is the same as that of $xI - A$. This
is the basic reasoning behind the computational theory of the rational canonical form of
a matrix. Again Scientific Workplace does the computation for me and gets the transpose
of my result - probably because it has transposed the companion matrices, which some
people will:


Gs)=(3)G3)(0 ¥) : 
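
The similarity can be verified directly. The Python sketch below rests on two assumptions: that $A$ is the matrix with rows $(1, 2)$ and $(3, 4)$, whose characteristic polynomial is $x^2 - 5x - 2$, and that the companion matrix is written with last column $(2, 5)$, untransposed. The change-of-basis matrix `P`, with columns $e_1$ and $Ae_1$, is my own choice and is not taken from the text.

```python
def matmul(X, Y):
    """Product of two integer matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of det(xI - M) = x^2 - tr(M) x + det(M)."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)


A = [[1, 2], [3, 4]]
C = [[0, 2], [1, 5]]   # assumed companion matrix of x^2 - 5x - 2
P = [[1, 1], [0, 3]]   # columns e1 and A*e1; det(P) = 3, so P is invertible over Q

print(char_poly_2x2(A), char_poly_2x2(C))
print(matmul(A, P) == matmul(P, C))  # A P = P C means A = P C P^{-1}
```

Equal characteristic polynomials alone would not prove similarity in general; it is the relation $AP = PC$ with $P$ invertible that does.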


Exercise 7.1.9 Put the following matrix of integers into its Smith normal form using only 
elementary row operations over Z: 


123 
456 
789 


Exercise 7.1.10 Put the following two matrices in their Smith normal forms using only
elementary row operations over $\mathbb{Z}_3[x]$:

$$\begin{pmatrix} x-1 & 1 \\ 0 & x-1 \end{pmatrix}, \qquad \begin{pmatrix} x-1 & 0 \\ 1 & x-2 \end{pmatrix}.$$

Now we move to some topics that lead to the abstraction of matrices as linear functions
(also called linear maps or transformations) in the next section. Before defining linear
functions, we need to define the abstract version of $F^n$: namely, the vector space over $F$.


Definition 7.1.1 A non-empty set $V$ (the vectors) is a vector space over a field $F$ (the
scalars) if $V$ is an Abelian group under addition and there is a function from $F \times V$ into
$V$ sending $(\alpha, v) \in F \times V$ to $\alpha \cdot v = \alpha v$ (multiplication by scalars) such that, $\forall v, w \in V$
and $\forall \alpha, \beta \in F$, we have the following four properties:


1. $\alpha(v + w) = \alpha v + \alpha w$;
2. $(\alpha + \beta)v = \alpha v + \beta v$;
3. $\alpha(\beta v) = (\alpha\beta)v$;
4. $1v = v$.


Note that we do not put arrows over our vectors. We are content to differentiate vectors 
from scalars by using Greek letters for scalars when possible. 
Example. The plane $\mathbb{R}^2 = \{(x, y) \mid x, y \in \mathbb{R}\}$ is a vector space over the field $\mathbb{R}$. Here addition
is componentwise, that is,

$$(x, y) + (u, v) = (x + u, y + v),$$

and multiplication by scalars is also componentwise, that is,

$$\alpha(x, y) = (\alpha x, \alpha y),$$

for all $x, y, \alpha \in \mathbb{R}$.
If you replace $\mathbb{R}$ by any field $F$, you make $F^2$ into a vector space with analogous definitions
of addition and multiplication by scalars. Similarly you obtain a vector space $F^n$ over
$F$, by considering the set of vectors $(x_1, \dots, x_n)$, $x_i \in F$, $i = 1, \dots, n$, with componentwise
addition and scalar multiplication:

$$(x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n);$$

$$\alpha(x_1, \dots, x_n) = (\alpha x_1, \dots, \alpha x_n), \quad \text{for } x_i, y_i \in F,\ i = 1, \dots, n \text{ and } \alpha \in F. \text{ ▲}$$


Next we need some definitions in order to know what we mean by dimension of a vector 
space - span of a set of vectors and linearly independent set of vectors. 


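Definition 7.1.2 The span of a set $S$ of vectors in the vector space $V$ over the field $F$ is
the set of finite $F$-linear combinations of vectors from $S$: that is,

$$\operatorname{Span}(S) = \left\{ \sum_{i=1}^{k} \alpha_i v_i \;\middle|\; \alpha_i \in F,\ v_i \in S,\ \forall i, \text{ with } k \text{ any positive integer} \right\}.$$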


By definition, a finite-dimensional vector space has a finite spanning set. If the vector 
space V does not have a finite set S of elements such that any element of V is a finite linear 
combination of elements from S, then V is infinite dimensional. We will mostly consider 
finite-dimensional vector spaces here. 


Definition 7.1.3 $S$ is a set of linearly independent vectors if

$$\sum_{i=1}^{n} \alpha_i v_i = 0$$

for some scalars $\alpha_i \in F$ and vectors $v_i \in S$ implies all $\alpha_i = 0$.


Definition 7.1.4 A finite subset B of the finite-dimensional vector space V over the field 
F is a basis of V if it has the following two properties: 


1. B spans V, that is, V=Span(B). 
2. $B$ is a set of linearly independent vectors.


The dimension of a finite-dimensional vector space V over the field F is the size of 
a basis B of V. 
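
For vectors in $\mathbb{Z}_p^n$ these definitions can be tested by machine. The Python sketch below is an illustration under the assumption that $p$ is prime: it row reduces a list of vectors and counts the pivots, so that a list of $k$ vectors is linearly independent exactly when the count is $k$, and $n$ vectors form a basis of $\mathbb{Z}_p^n$ exactly when the count is $n$. The function name `rank_mod_p` and the sample vectors are made up.

```python
def rank_mod_p(vectors, p):
    """Number of pivots after Gaussian elimination over Z_p on the given row vectors."""
    A = [[x % p for x in v] for v in vectors]
    rank = 0
    for c in range(len(A[0])):
        # Find a row at or below the current rank with a nonzero entry in column c.
        pivot = next((i for i in range(rank, len(A)) if A[i][c]), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        inv = pow(A[rank][c], -1, p)  # inverse of the pivot in Z_p (Python 3.8+)
        for i in range(rank + 1, len(A)):
            f = (A[i][c] * inv) % p
            A[i] = [(x - f * y) % p for x, y in zip(A[i], A[rank])]
        rank += 1
    return rank


# In Z_5^3 the third vector is the sum of the first two, so the set is dependent.
dependent = [(1, 2, 3), (0, 1, 4), (1, 3, 2)]
print(rank_mod_p(dependent, 5))

# These three vectors are independent, hence a basis of Z_5^3.
basis = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
print(rank_mod_p(basis, 5))
```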
To show that the idea of dimension makes sense, one must prove that any finite-dimensional
vector space has a basis. We do this in the next section. Then one must show
that any two bases of a vector space have the same number of elements. 


Exercise 7.1.11 Prove that any two bases A and B of a finite-dimensional vector space V 
over a field F must have the same number of elements. 


Hint. See Birkhoff and MacLane [9], Dornhoff and Hohn [25], Schreier and Sperner [101],
or Strang [115]. Schreier and Sperner argue that if $A = \{a_1, \dots, a_p\}$ spans $V$ and $B =
\{b_1, \dots, b_q\}$ is linearly independent, then $q \leq p$ (using induction on $q$). In particular, they
prove that we can replace $q$ of the vectors in $A$ with the vectors in $B$ and still span $V$. They
start with $q = 0$, which is a very silly case - but nevertheless legal and sensible to begin with.
Assuming the result for the subset $\{b_1, \dots, b_{q-1}\}$ of $B$, they argue that, by renumbering the
set $A$, we can replace the first $q - 1$ vectors in $A$ with the first $q - 1$ vectors in $B$ and
still span $V$. Then we know that $b_q = \sum_{i=1}^{p} \beta_i a_i$ for some $\beta_i \in F$. Not all $\beta_i$ for $i > q - 1$ can
vanish, since the set $B$ consists of linearly independent vectors and $a_1 = b_1, \dots, a_{q-1} = b_{q-1}$.
In particular, the sum must have terms in it beyond the first $q - 1$ terms. This means that
indeed $p \geq q$ and some $\beta_i \neq 0$, say for $i = q$. Then we can replace $a_q$ with $b_q$ and still span $V$.


Example. The plane $\mathbb{R}^2$ is two-dimensional with basis {(1,0), (0,1)}. This is the standard
basis. Another basis of $\mathbb{R}^2$ is {(1,1), (2,0)}. ▲


Exercise 7.1.12 Prove the preceding statements and then the analog replacing $\mathbb{R}$ by $\mathbb{Z}_3$.


Exercise 7.1.13 Show that, for a matrix in row-echelon form, the set of nonzero rows is a 
linearly independent set. 


Notation. We will view elements of $F^n = \{(x_1,\dots,x_n) \mid x_i \in F\}$ as row vectors (mostly). This
means we need to write $vM$ if $M \in F^{n\times m}$. The function $v \mapsto vM$ composes badly with another
matrix function since the function part, M, is on the right rather than the left. That is why 
I would really prefer to write column vectors. But they take up a lot of space on a page. 


Exercise 7.1.14 Show that a basis for $F^n$ is the set of n vectors with (n - 1) zero entries
and one entry of 1:

(1,0,0,...,0,0), (0,1,0,...,0,0), ..., (0,0,0,...,0,1).


You could surely make the following definition yourself, having seen the definitions of 
subgroup and subring. 


Definition 7.1.5 A subspace W of a vector space V over the field F is a non-empty subset
$W \subset V$ which is a vector space under the same operations as those of V.


Example 1. The plane $\mathbb{R}^2$ has as subspace $W = \{(x,0) \mid x \in \mathbb{R}\}$, the real line. Similarly $\mathbb{Z}_3^2$
has as a subspace $W = \{(x,0) \mid x \in \mathbb{Z}_3\}$. ▲

Example 2. A subfield F of a field E is a vector subspace of E, considering E and F as vector
spaces over any subfield of F. ▲


Exercise 7.1.15 If S is any subset of the vector space V over the field F, prove that Span(S) 
is indeed a vector subspace of V. 


Exercise 7.1.16 Prove that elementary row operations do not change the span of the set of 
rows of a matrix. 


Your linear algebra text may define the row rank of a matrix as the number of pivots of 
the matrix. Recall that the pivots are the first entries in the nonzero rows of the row echelon 
form of the matrix. Now we want to define the row rank using the idea of dimension. 


Definition 7.1.6 The (row) rank of a matrix $A \in F^{m\times n}$ is the dimension of the span of
the set of row vectors of A. 


Thanks to the following exercises, we just say “rank” rather than the row rank of a matrix. 


Exercise 7.1.17 Show that the row rank of a matrix $A \in F^{m\times n}$ is the same as the number of
pivots, that is, the first nonzero entries of the nonzero rows of the row echelon form of A. 


Exercise 7.1.18 Prove that the row rank of a matrix is the same as the column rank (which 
is the dimension of the span of the set of columns of the matrix). 
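
Here is a small computational companion to the last two exercises, offered as a sketch rather than a proof. It counts pivots after Gaussian elimination over $\mathbb{Z}_p$ (p prime) and compares the row rank with the column rank. The names `rank_mod_p` and `transpose` and the sample matrix are illustrative assumptions, not notation from the text; `pow(x, -1, p)` requires Python 3.8 or later.

```python
def rank_mod_p(rows, p):
    """Row rank of a matrix over Z_p (p prime), found by counting pivots."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # look for a row at or below position `rank` with a nonzero entry here
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)          # inverse of the pivot mod p
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def transpose(rows):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*rows)]


# Hypothetical sample matrix, read first over Z_5 and then over Z_3.
A = [[1, 2, 0, 1],
     [2, 4, 1, 0],
     [0, 0, 1, 1]]
for p in (5, 3):
    print(p, rank_mod_p(A, p), rank_mod_p(transpose(A), p))
```

For this matrix the rank is 3 over $\mathbb{Z}_5$ but 2 over $\mathbb{Z}_3$, so the rank depends on the field, while the row rank and column rank agree in both cases.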


Exercise 7.1.19 Show that if F is a field then the set of polynomials F[x] in one indeterminate
and coefficients in F forms a vector space over F with the usual addition of polynomials
and multiplication by constants in F. Then show that F[x] is an infinite-dimensional vector
space over F.


Since we have already discussed replacing the field F in a matrix with a ring like $\mathbb{Z}$ or
F[x], it makes sense to think about what happens to the definition of vector space when
the field F of scalars is replaced with a ring R of scalars. Well, algebraists have a word for
that - an R-module M or, maybe, a left R-module M if R is not commutative. It is really
an exercise to define it - just copy the definition of vector space. Of course, you need to
leave out the requirement that 1v = v, for all $v \in M$, if the ring does not have an identity 1
for multiplication - and you need to add the word “left” if R is not commutative. Similarly
you should be able to define submodule, module homomorphism, quotient module, direct
product of modules. Here we are mostly interested in $R = \mathbb{Z}$ or F[x] - both commutative and,
even better, Euclidean, meaning they have a division algorithm. Thus you can compute such
things as the Smith normal form of a matrix using the division algorithm.

A $\mathbb{Z}$-module is really an Abelian group. So we are not motivated to take up the subject
of modules, which is usually left to graduate algebra courses. The $\mathbb{Z}$-module that you
get from $\mathbb{Z}^n$ under componentwise addition and the scalar multiplication
$\gamma(a_1,\dots,a_n) = (\gamma a_1,\dots,\gamma a_n)$, for $\gamma$ and $a_i \in \mathbb{Z}$, has a special name - the free $\mathbb{Z}$-module of rank n. The
rank of a free module is analogous to the dimension of a vector space. Moreover, one has
the result that a submodule M of a free $\mathbb{Z}$-module N of rank n is also free and has rank $\le n$.
This was useful in our sketch of a proof of the fundamental theorem of Abelian groups.
Many algebra books cover this subject: for example, Dummit and Foote [28], Hungerford 
[47], and Lang [65]. Some of these texts may be a bit hard to read because they are directed 
at graduate students and usually refuse to assume their rings are Euclidean integral domains 
like Z - meaning that there is a division algorithm. Instead a weaker condition is assumed - 
that all ideals are principal - when proving something implying the fundamental theorem 
of Abelian groups. 


7.2 Linear Functions or Mappings 


Modern mathematicians - Mr. Bourbaki we are thinking of you - try not to write equations 
with subscripts and subsubscripts. This means they really want to eliminate matrices from 
polite conversation. This leads to the following definition. 


Definition 7.2.1 If V and W are both vector spaces over the field F, then we say
that a function $T: V \to W$ is a linear function, also called a linear mapping or linear
transformation, if T has the following two properties for all $v, w \in V$ and all $a \in F$:


1. $T(v+w) = T(v) + T(w)$,
2. $T(av) = aT(v)$.


Here we will usually call a linear mapping T a linear map for short. Elsewhere - particularly
for infinite-dimensional vector spaces - it may be called a linear operator. Strangely,
it is not called a vector space homomorphism.


Definition 7.2.2 If V and W are both vector spaces over the field F, then a linear mapping
$T: V \to W$ is a vector space isomorphism over F iff T is 1-1 and onto. Then we write
$V \cong W$ and say V is isomorphic to W.


Example 1. If $B = \{b_1,\dots,b_m\}$ is a basis of the vector space V over the field F, then $V \cong F^m$.
The mapping $M_B$ is defined by writing $v \in V$ in the form $v = \sum_{i=1}^{m} a_i b_i$, with $a_i \in F$, and then
setting

$$M_B(v) = (a_1,\dots,a_m). \qquad (7.3)$$

We leave it as an exercise to check that $M_B$ is indeed a vector space isomorphism. ▲


Exercise 7.2.1 Check that $M_B$ defined above is a vector space isomorphism $M_B: V \to F^m$.

Exercise 7.2.2 Show that if $m \ne n$, then $F^m$ is not isomorphic to $F^n$.


Exercise 7.2.3 Show that if V is the vector space

$$V = C^{\infty}(\mathbb{R}) = \{ f: \mathbb{R} \to \mathbb{R} \mid f^{(n)}(x) \text{ exists } \forall x \in \mathbb{R},\ \forall n \ge 1 \},$$

then the derivative mapping $Lf = f'$, $f \in V$, is linear.


Example 2. Consider the linear mapping $T: \mathbb{Z}_2^4 \to \mathbb{Z}_2^7$ defined by mapping a row vector $v \in \mathbb{Z}_2^4$
to the row vector $T(v) = vG$, where


OorRrFR oO 
RP Re OF 


11 000 
01 100 
00 010 
00 10 1 A 


Exercise 7.2.4 Show that if we define T as in Example 2, the image space $T(\mathbb{Z}_2^4)$ is a vector
subspace of $\mathbb{Z}_2^7$. Find a basis for $T(\mathbb{Z}_2^4)$. What is the dimension of the image space?


Now we address the problem of showing that a basis of a finite-dimensional vector space 
exists. 


Methods to find a Basis of a Finite-Dimensional Vector Space V over a Field F 


Method 1. If you have a finite spanning set S of vectors in V, you keep deleting any vectors 
that can be written as a (finite) linear combination of other vectors from S. 


Method 2. Start with one nonzero vector or any non-empty set of linearly independent 
vectors in V and keep adding vectors from V that are not in the span of the vectors you 
already have. 


Exercise 7.2.5 Prove that the two methods to find a basis of V actually work. 
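
The methods can be tried out on a computer. The following sketch, under the assumption that the field is $\mathbb{Q}$ (exact arithmetic via Python's `fractions` module), walks through a finite spanning set and keeps a vector only when it is not in the span of the vectors already kept; this amounts to Method 1 run greedily. The helper names `rank` and `extract_basis` and the sample set S are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank over Q, by forward Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def extract_basis(spanning_set):
    """Keep a vector only if it enlarges the span of the vectors kept so far."""
    basis = []
    for v in spanning_set:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis


# Hypothetical spanning set of Q^3 with two redundant vectors.
S = [(1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1), (0, 0, 1)]
print(extract_basis(S))
```

Here (2,2,0) is dropped as a multiple of (1,1,0), and (1,2,1) is dropped as the sum of (1,1,0) and (0,1,1), leaving three vectors.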


Given that V and W are both (finite-dimensional) vector spaces over the field F, then the
matrix of a linear mapping $T: V \to W$ with respect to the (ordered) bases $B = \{b_1,\dots,b_m\}$
of V and $C = \{c_1,\dots,c_n\}$ of W is the n x m array of scalars $t_{ij} \in F$ defined by:

$$\mathrm{Mat}_{C,B}(T) = (t_{ij})_{1 \le i \le n,\ 1 \le j \le m}, \quad \text{where } T(b_j) = \sum_{i=1}^{n} t_{ij} c_i, \text{ for } j = 1,\dots,m. \qquad (7.4)$$


We have defined the matrix of a linear transformation this way so that the composition of 
linear transformations corresponds to product of matrices. 


Exercise 7.2.6 Suppose V, W are vector spaces over the field F and $T: V \to W$ is a linear
map. If B, C are (ordered) bases of V, W, show that, using the definition of $M_B$ in equation
(7.3) from Example 1 above - as well as the definition of $\mathrm{Mat}_{C,B}(T)$ in (7.4) - we have

$${}^{T}M_C(T(v)) = \mathrm{Mat}_{C,B}(T)\; {}^{T}M_B(v),$$

where if $A = (a_{ij})_{1 \le i \le n,\ 1 \le j \le m}$, the transpose of A, denoted ${}^{T}\!A$, is the matrix $(a_{ji})_{1 \le j \le m,\ 1 \le i \le n}$,
where the rows and columns are interchanged.


Hint. By the linearity of T, we have for (ordered) bases $B = \{b_1,\dots,b_m\}$ of V and
$C = \{c_1,\dots,c_n\}$ of W, with $t_{ij}$ the entries of $\mathrm{Mat}_{C,B}(T)$ as in (7.4):

$$T\Big(\sum_{j=1}^{m} a_j b_j\Big) = \sum_{j=1}^{m} a_j T(b_j) = \sum_{j=1}^{m} a_j \sum_{i=1}^{n} t_{ij} c_i.$$

Interchange the sums over i and j and you will have done the exercise.

Exercise 7.2.7 Suppose V, W, U are vector spaces over the field F and $T: V \to W$, $S: W \to U$
are both linear maps. Show that the composition $S \circ T: V \to U$ is also a linear map. Then
show that if B, C, D are (ordered) bases of V, W, U respectively, then

$$\mathrm{Mat}_{D,B}(S \circ T) = \mathrm{Mat}_{D,C}(S)\, \mathrm{Mat}_{C,B}(T), \qquad (7.5)$$


where the product on the right is the usual matrix multiplication. 


Given that V and W are both vector spaces over the field F, and $T: V \to W$ is a linear
map, the image of T, T(V), is a vector subspace of W. The space T(V) is also called the
range of T. Then the rank of the linear map T is defined to be the dimension of the image
$T(V) = \{Tv \mid v \in V\}$.


Exercise 7.2.8 

(a) Show that, assuming V and W are both vector spaces over the field F and $T: V \to W$ is
a linear transformation, then the image T(V) is indeed a vector subspace of W.

(b) Show that the rank of a linear transformation T is the same as the rank of a matrix of T
using bases B of V and C of W.


Exercise 7.2.9 Consider the field $\mathbb{Z}_3[i]$, where $i^2 + 1 = 0$.


(a) Show that $\mathbb{Z}_3[i]$ is a vector space over the field $\mathbb{Z}_3$.
(b) Define the map $F: \mathbb{Z}_3[i] \to \mathbb{Z}_3[i]$ by $F(z) = z^3$. Show that F is linear. Then find a matrix
of F using the basis {1, i} for $\mathbb{Z}_3[i]$ as a vector space over the field $\mathbb{Z}_3$.


Exercise 7.2.10 Suppose that V and W are vector spaces and $T: V \to W$ is a 1-1 linear
transformation. Show that if B is a basis of V, then T(B) is a basis of T(V). Conclude
that if V and W are isomorphic finite-dimensional vector spaces, then they have the same
dimension. Moreover, show that if $V \cong W$ and V is finite dimensional, so is W.


There is another vector space associated to a linear transformation T - the kernel of T. 


Definition 7.2.3 Suppose that V, W are vector spaces and $T: V \to W$ is a linear
transformation. The kernel of T, also called the nullspace of T, is

$$\ker T = \{v \in V \mid Tv = 0\}.$$


The dimension of ker T is often called the nullity of T (or of a matrix corresponding 
to T in the case that V and W are finite dimensional). 


Theorem 7.2.1 Suppose that V and W are both (finite-dimensional) vector spaces over
the field F, and $T: V \to W$ is a linear map. If $\ker T = \{v \in V \mid Tv = 0\}$ then

$$\dim \ker T + \dim T(V) = \dim V.$$

Proof. Take a basis $B = \{b_1,\dots,b_m\}$ of ker T and extend it to a basis
$C = \{b_1,\dots,b_m, b_{m+1},\dots,b_n\}$ of V. Then we claim that $\{T(b_{m+1}),\dots,T(b_n)\}$ is a basis for
T(V). ▲

Exercise 7.2.11 Fill in the details in the preceding proof.


Exercise 7.2.12 Prove that if V is a finite-dimensional vector space over the field F, then a
linear mapping $T: V \to V$ is 1-1 iff it is onto.
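
To see Theorem 7.2.1 in a concrete case, here is a brute-force sketch over $\mathbb{Z}_2$. The 4 x 3 matrix G below is a made-up example, and T(v) = vG uses the row-vector convention of Section 7.1. Since a subspace of dimension d over $\mathbb{Z}_p$ has $p^d$ elements, dimensions can be read off from counts. The function names are illustrative.

```python
from itertools import product


def apply_map(v, G, p):
    """Row vector v times matrix G, entries reduced mod p."""
    return tuple(sum(v[i] * G[i][j] for i in range(len(G))) % p
                 for j in range(len(G[0])))


def dim_from_count(count, p):
    """Dimension d of a Z_p-subspace with `count` = p**d elements."""
    d = 0
    while p ** d < count:
        d += 1
    return d


p = 2
G = [[1, 0, 1],        # hypothetical matrix; rows 3 and 4 depend on rows 1 and 2
     [0, 1, 1],
     [1, 1, 0],
     [1, 0, 1]]
domain = list(product(range(p), repeat=len(G)))            # all of Z_2^4
kernel = [v for v in domain if not any(apply_map(v, G, p))]
image = {apply_map(v, G, p) for v in domain}

print(dim_from_count(len(kernel), p), dim_from_count(len(image), p), len(G))
```

Both the kernel and the image have 4 elements here, so the dimensions are 2 and 2, which add up to dim V = 4.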


Exercise 7.2.13 Suppose that $B = \{b_1,\dots,b_n\}$ and $C = \{c_1,\dots,c_n\}$ are two (ordered) bases
of the vector space V over the field F. Let $1_V(x) = x$, $\forall x \in V$, be the identity map (which is
certainly linear).


(a) Show that if $M = \mathrm{Mat}_{C,B}(1_V)$, then M is invertible.
(b) Show that for any linear map $L: V \to V$, we have

$$\mathrm{Mat}_{C,B}(1_V)\, \mathrm{Mat}_{B,B}(L) = \mathrm{Mat}_{C,C}(L)\, \mathrm{Mat}_{C,B}(1_V).$$


Hint for part (b). Use equation (7.5) to see that both sides of the equality are $\mathrm{Mat}_{C,B}(L)$.


Definition 7.2.4 If A and B are n x n matrices with entries in the field F, we say that A
and B are similar iff there is an invertible n x n matrix U such that $B = U^{-1}AU$.


Exercise 7.2.14 Here we ask for proofs of some useful facts about similarity. 


(a) Show that similarity is an equivalence relation on $F^{n\times n}$.
(b) Show that two n x n matrices A and B over the field F are similar iff they are the
matrices of the same linear transformation $T: F^n \to F^n$ with respect to two bases of $F^n$.


As we said in Section 7.1, the Smith normal form allows one to obtain canonical forms
of matrices (such as the Jordan form or rational canonical form) so that any matrix will
be similar to only one matrix of a given canonical form. See Dornhoff and Hohn [25] or
Dummit and Foote [28] for more details. This can be useful despite the reluctance of some
applied books to consider the Jordan form of a matrix.

For example, if one needs to do Fourier analysis on the general linear group $G = GL(2, \mathbb{Z}_p)$
of invertible 2 x 2 matrices over the field $\mathbb{Z}_p$, p prime, one must know the conjugacy classes
$\{g\} = \{x^{-1}gx \mid x \in G\}$. That is, one needs to know the similarity classes.

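These similarity (conjugacy) classes in $GL(2, \mathbb{Z}_p)$ for odd primes p are

central $\left\{\begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix}\right\}$, where $r \ne 0$;

parabolic $\left\{\begin{pmatrix} r & 1 \\ 0 & r \end{pmatrix}\right\}$, where $r \ne 0$;

hyperbolic $\left\{\begin{pmatrix} r & 0 \\ 0 & s \end{pmatrix}\right\}$, where $rs \ne 0$ and $r \ne s$;

elliptic $\left\{\begin{pmatrix} r & s\delta \\ s & r \end{pmatrix}\right\}$, where $\delta$ is not a square in $\mathbb{Z}_p$ and $s \ne 0$.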


See Terras [116, p. 366] for more information. 
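
For a small prime the list of classes can be checked by brute force. The sketch below stores a 2 x 2 matrix as a tuple (a, b, c, d), enumerates $GL(2, \mathbb{Z}_p)$, and collects the classes $\{x^{-1}gx \mid x \in G\}$; the function names are hypothetical. Counting the four types in the list gives (p - 1) + (p - 1) + (p - 1)(p - 2)/2 + p(p - 1)/2 = p^2 - 1 classes, which is what the code finds for p = 3.

```python
from itertools import product


def mul(A, B, p):
    """Product of 2 x 2 matrices stored as (a, b, c, d), entries mod p."""
    (a, b, c, d), (e, f, g, h) = A, B
    return ((a*e + b*g) % p, (a*f + b*h) % p,
            (c*e + d*g) % p, (c*f + d*h) % p)


def inverse(A, p):
    """Inverse of an invertible 2 x 2 matrix mod p (Python 3.8+ for pow)."""
    a, b, c, d = A
    det_inv = pow((a*d - b*c) % p, -1, p)
    return ((d * det_inv) % p, (-b * det_inv) % p,
            (-c * det_inv) % p, (a * det_inv) % p)


def conjugacy_classes(p):
    """Return the elements of GL(2, Z_p) and the list of its conjugacy classes."""
    G = [A for A in product(range(p), repeat=4)
         if (A[0]*A[3] - A[1]*A[2]) % p]
    seen, classes = set(), []
    for g in G:
        if g in seen:
            continue
        cls = {mul(mul(inverse(x, p), g, p), x, p) for x in G}
        classes.append(cls)
        seen |= cls
    return G, classes


G3, classes3 = conjugacy_classes(3)
print(len(G3), len(classes3), sorted(len(c) for c in classes3))
```

For p = 3 the group has 48 elements in 8 classes: two central classes of size 1, two parabolic classes of size 8, one hyperbolic class of size 12, and three elliptic classes of size 6.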

Recall that the characteristic p of a finite field F is a prime number p which is the order of 
1 in the additive group of F. Exercise 6.1.9 showed that if F is a finite field of characteristic 
p, then F contains $\mathbb{Z}_p$ as a subfield. The next proposition says a little more.

Proposition 7.2.1 A finite field F of characteristic p (necessarily a prime) is a vector space
over $\mathbb{Z}_p$.


Proof. Look at the additive subgroup H of F which is generated by 1. Then $T: \mathbb{Z} \to H \subset F$
defined by $T(n) = n \cdot 1$ is a ring homomorphism mapping $\mathbb{Z}$ onto H. By the definition of
characteristic, we know that $\ker T = p\mathbb{Z}$. By the first isomorphism theorem, we know that H
is isomorphic to $\mathbb{Z}/p\mathbb{Z} = \mathbb{Z}_p$. This implies that we may view $\mathbb{Z}_p$ as a subfield of F. But then
scalar multiplication by $a \in \mathbb{Z}_p$ makes sense for elements $v \in F$, implying that F is indeed a
vector space over $\mathbb{Z}_p$. ▲


Corollary 7.2.1 A finite field F of characteristic p is a vector space over $\mathbb{Z}_p$ which is necessarily
finite dimensional. If the dimension of F over $\mathbb{Z}_p$ is n, then $F \cong \mathbb{Z}_p^n$. This implies that
F has $p^n$ elements.


Exercise 7.2.15 Show that there is no integral domain with exactly six elements. More 
generally, show that there is no integral domain with pq elements if p and q are distinct 
primes. 


Notation. When p is a prime, we write $\mathbb{F}_{p^n}$ for the field with $p^n$ elements since we will be
able to show that there is only one such field up to isomorphism fixing elements of $\mathbb{F}_p$.
However, many texts write $GF(p^n)$ instead of $\mathbb{F}_{p^n}$ and they call the field a Galois field. See
Gallian [33, Chapter 22]. The name honors Galois who first introduced the idea in 1830.
However, Gauss had certainly considered congruences and $\mathbb{F}_p$ much earlier in 1799 with
his book Disquisitiones Arithmeticae. In notation, as usual, we are following Bourbaki and
using blackboard bold ($\mathbb{F}$) rather than the usual font for F to free up the letter F for other
uses. However, we might note that $\mathbb{Z}_p$ has another meaning in number theory (the ring of
p-adic integers, which will not be considered in this text). There are never enough letters!


Example: The Quaternions. Consider a four-dimensional vector space $\mathbb{H}$ over $\mathbb{R}$ with basis
1, i, j, k. We define multiplication by first defining how to multiply the basis vectors as in the
multiplication table for the quaternion group in Section 3.6. That is, $i^2 = j^2 = k^2 = ijk = -1$.
Then assume that the multiplication satisfies the usual associative and distributive laws plus
$(\alpha v) \cdot w = \alpha (v \cdot w) = v \cdot (\alpha w)$, for all $\alpha \in \mathbb{R}$ and $v, w \in \mathbb{H}$. This gives a non-commutative ring
- also called a division algebra - as in Definition 7.2.5 below. It turns out that you can divide
by nonzero elements. That is because we have an analog of complex conjugate:

$$\overline{\alpha_1 + \alpha_2 i + \alpha_3 j + \alpha_4 k} = \alpha_1 - \alpha_2 i - \alpha_3 j - \alpha_4 k.$$

Then if $v = \alpha_1 + \alpha_2 i + \alpha_3 j + \alpha_4 k$, for $\alpha_r \in \mathbb{R}$, we have the norm of v given by
$Nv = v\bar{v} = \alpha_1^2 + \alpha_2^2 + \alpha_3^2 + \alpha_4^2$. This means that when $v \ne 0$, we have $v^{-1} = \bar{v}/Nv \in \mathbb{H}$.

We told some of the story of Hamilton’s discovery of the quaternions in Section 3.6 where
we introduced the quaternion group. The quaternions have proved useful in physics and
number theory. The construction has been generalized, replacing $\mathbb{R}$ by other fields. Finite
quaternions turn out not to be so interesting as they are full matrix algebras like $\mathbb{R}^{n\times n}$. ▲
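
Here is a minimal sketch of quaternion arithmetic, assuming a quaternion is stored as the 4-tuple of its coefficients with respect to the basis 1, i, j, k. Multiplication is extended bilinearly from the table for the basis elements, just as in the construction of the example; the names `TABLE`, `qmul`, `conj`, `norm`, and `inv` are illustrative, and exact fractions stand in for real numbers.

```python
from fractions import Fraction

# Products of the basis elements i, j, k (indices 1, 2, 3):
# (row, column) -> (sign, index of the resulting basis element)
TABLE = {
    (1, 1): (-1, 0), (1, 2): (1, 3),  (1, 3): (-1, 2),   # ii=-1, ij=k,  ik=-j
    (2, 1): (-1, 3), (2, 2): (-1, 0), (2, 3): (1, 1),    # ji=-k, jj=-1, jk=i
    (3, 1): (1, 2),  (3, 2): (-1, 1), (3, 3): (-1, 0),   # ki=j,  kj=-i, kk=-1
}


def basis_product(r, s):
    """Product of basis elements number r and s; index 0 is the identity 1."""
    if r == 0:
        return (1, s)
    if s == 0:
        return (1, r)
    return TABLE[(r, s)]


def qmul(x, y):
    """Quaternion product, extended bilinearly from the basis table."""
    out = [0, 0, 0, 0]
    for r in range(4):
        for s in range(4):
            sign, t = basis_product(r, s)
            out[t] += sign * x[r] * y[s]
    return tuple(out)


def conj(x):
    return (x[0], -x[1], -x[2], -x[3])


def norm(x):
    return sum(c * c for c in x)


def inv(x):
    """Inverse of a nonzero quaternion: conjugate divided by norm."""
    n = norm(x)
    return tuple(Fraction(c) / n for c in conj(x))


i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
v = (1, 2, 3, 4)
print(qmul(i, j), qmul(j, i), qmul(v, inv(v)))
```

The products ij and ji differ by a sign, which is the non-commutativity, and multiplying v by `inv(v)` returns the identity.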


Exercise 7.2.16 Suppose we multiply two quaternions

$$(\alpha_1 + \alpha_2 i + \alpha_3 j + \alpha_4 k)(\beta_1 + \beta_2 i + \beta_3 j + \beta_4 k) = \gamma_1 + \gamma_2 i + \gamma_3 j + \gamma_4 k. \qquad (7.6)$$

Here the $\alpha_r$, $\beta_r$, and $\gamma_r$ are in $\mathbb{R}$. Show that $\gamma_1 = \alpha_1\beta_1 - \alpha_2\beta_2 - \alpha_3\beta_3 - \alpha_4\beta_4$. Obtain similar
equations for the rest of the $\gamma_r$, r = 2, 3, 4.


Exercise 7.2.17 Show that, in the quaternions $\mathbb{H}$, we have $\overline{x \cdot y} = \bar{y} \cdot \bar{x}$.

Exercise 7.2.18 Use what we know about quaternions to prove that we have the multiplicative
property of the norm, which says NvNw = N(vw). This gives Lagrange’s identity, which
states that if the relationship of the $\gamma$s to the $\alpha$s and $\beta$s is as in equation (7.6), then

$$(\alpha_1^2 + \alpha_2^2 + \alpha_3^2 + \alpha_4^2)(\beta_1^2 + \beta_2^2 + \beta_3^2 + \beta_4^2) = \gamma_1^2 + \gamma_2^2 + \gamma_3^2 + \gamma_4^2.$$


The following definition generalizes the quaternions. 


Definition 7.2.5 An (associative) algebra A over a field F is a finite-dimensional vector
space over F with a multiplication operation that makes A a ring such that for all $\alpha \in F$
and $x, y \in A$ we have $\alpha(xy) = (\alpha x)y = x(\alpha y)$. The algebra is called a division algebra if, in
addition, A has an identity 1 for multiplication and, for every nonzero $a \in A$, there is an inverse
$a^{-1} \in A$ such that $aa^{-1} = a^{-1}a = 1$.


In 1878 Frobenius proved that the only associative division algebras over $\mathbb{R}$ are $\mathbb{R}$, $\mathbb{C}$,
and $\mathbb{H}$ (the quaternions).

Another example of an algebra over a field F is $A = F^{n\times n}$, consisting of all n x n matrices
over the field F (under the usual componentwise addition and matrix multiplication) with
componentwise multiplication by scalars in F. This algebra is simple, meaning that it has
no two-sided ideals except (0) and $(1) = A = F^{n\times n}$.


Exercise 7.2.19 Prove the last statement. 


In 1908 Wedderburn proved a converse to the simplicity of $F^{n\times n}$. A special case of Wedderburn’s
theorem says that any simple algebra over $\mathbb{C}$ is isomorphic to the algebra $\mathbb{C}^{n\times n}$.
Wedderburn’s general result says that if A is a simple algebra over any field F, then there
is a division algebra D over F such that A is isomorphic to $D^{n\times n}$, for some integer n > 0.


Exercise 7.2.20 Prove that any division algebra is simple. 


Another example of an algebra is the group algebra F[G] over the field F associated to
a finite group G. This algebra is defined as follows. Suppose that $G = \{g_1,\dots,g_n\}$. Identify
the elements of G with a basis for a vector space over F. Then as a vector space the algebra
F[G] consists of vectors $\sum_{i=1}^{n} \alpha_i g_i$, for $\alpha_i \in F$. Then we add and multiply in A = F[G] as follows,
with $\alpha_i, \beta_i \in F$,

$$\sum_{i=1}^{n} \alpha_i g_i + \sum_{i=1}^{n} \beta_i g_i = \sum_{i=1}^{n} (\alpha_i + \beta_i)\, g_i,$$

$$\Big(\sum_{i=1}^{n} \alpha_i g_i\Big) \cdot \Big(\sum_{j=1}^{n} \beta_j g_j\Big) = \sum_{k=1}^{n} \sum_{\substack{i,j \\ g_i g_j = g_k}} (\alpha_i \beta_j)\, g_k.$$

If we identify $g_1$ with the identity of G and $\alpha \in F$ with $\alpha g_1$, then we have multiplication by
scalars as

$$\alpha \sum_{i=1}^{n} \beta_i g_i = \sum_{i=1}^{n} (\alpha\beta_i)\, g_i.$$
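In short, our formulas say that multiplication of the basis elements of the group algebra
F[G] just comes from the multiplication of the group elements. Thus the quaternion algebra
over $\mathbb{C}$ is closely related to the group algebra $\mathbb{C}[Q]$, where Q is the quaternion group: since Q has
8 elements, $\mathbb{C}[Q]$ is 8-dimensional, and the 4-dimensional quaternion algebra is the quotient
obtained by identifying the group element -1 with the scalar -1.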
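
The multiplication rule for F[G] is easy to experiment with. In the following sketch an element $\sum \alpha_i g_i$ is a Python dictionary mapping group elements to coefficients, and the group is passed in as a multiplication function. The choice of $G = S_3$ (permutations of {0, 1, 2} stored as tuples), integer coefficients, and all function names are assumptions made for illustration.

```python
def compose(s, t):
    """Composition of permutations stored as tuples: (s o t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))


def ga_add(x, y):
    """Add two group-algebra elements coefficient by coefficient."""
    return {g: x.get(g, 0) + y.get(g, 0) for g in set(x) | set(y)}


def ga_mul(x, y, op):
    """Multiply group-algebra elements: basis elements multiply as in the group."""
    out = {}
    for g, a in x.items():
        for h, b in y.items():
            k = op(g, h)
            out[k] = out.get(k, 0) + a * b
    return out


def clean(x):
    """Drop terms with zero coefficient."""
    return {g: c for g, c in x.items() if c != 0}


e = (0, 1, 2)            # identity of S_3
s = (1, 0, 2)            # a transposition, so s*s = e
r = (1, 2, 0)            # a 3-cycle

x = {e: 1, s: 1}         # e + s
y = {e: 1, s: -1}        # e - s
print(clean(ga_mul(x, y, compose)))                       # (e + s)(e - s)
print(ga_mul({s: 1}, {r: 1}, compose), ga_mul({r: 1}, {s: 1}, compose))
```

The product (e + s)(e - s) is 0, which shows that a group algebra can have zero divisors, so it need not be a division algebra. The last line shows sr and rs are different basis elements, so the algebra is non-commutative when G is.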

Fourier analysis on G is often carried out (thanks to Emmy Noether) via the study of the
group algebra F[G]. See Dummit and Foote [28, Chapter 15]. This requires Wedderburn’s
theory of the structure of algebras like F[G] as a direct sum of simple algebras over F.
I personally prefer to avoid Wedderburn theory and use the direct approach to Fourier 
analysis on finite groups in [116]. 


7.3 Determinants 


We have already referred to determinants numerous times. You should know the formula
for 2 x 2 determinants:

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$

and the analogous formula in three dimensions which has six terms. What happens in n
dimensions? One answer is to write a sum of n! terms - one term for every element of the
symmetric group. In short, our formula is:

$$\det\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{\sigma(1)1} a_{\sigma(2)2} \cdots a_{\sigma(n)n}. \qquad (7.7)$$

This formula is not so good for evaluating determinants - thanks to the humongous 
number of terms - even for relatively small n such as 50. It is not so good for proving 
anything about determinants either. For that we prefer the following definition. We assume 
that F is any field. It does not have to be the real numbers as it was in calculus. For many 
things it could just be a commutative ring with identity. 
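
To see why (7.7) is hopeless for large n but perfectly usable for tiny n, here is a direct transcription into Python. It is a sketch only: indices run from 0 rather than 1, the sign is computed by counting inversions, and the names `sgn`, `det_perm`, and `matmul` are illustrative.

```python
from itertools import permutations


def sgn(perm):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det_perm(a):
    """Determinant as a sum of n! terms, one for each permutation, as in (7.7)."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sgn(perm)
        for col in range(n):
            term *= a[perm[col]][col]      # the entry in row sigma(col), column col
        total += term
    return total


def matmul(x, y):
    """Usual matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]
print(det_perm(A), det_perm(B), det_perm(matmul(A, B)))
```

Already for n = 10 the loop runs over 3,628,800 permutations, which is why Gaussian elimination is preferred for actual computation.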


Definition 7.3.1 The determinant of a matrix is a function $d: F^{n\times n} \to F$ with the following
three properties.


1. d is a multilinear function: that is, it is a linear function of each column, holding the
other columns fixed;

2. d is alternating: that is, d(A) = 0 if two columns of A are equal;

3. d(I) = 1, if I is the identity matrix.


From this definition, it is possible to deduce (7.7) and all the standard facts about
determinants. We mostly follow Dornhoff and Hohn [25] for this discussion.

We will write our matrix $A = (A_1 \cdots A_n) \in F^{n\times n}$, meaning that $A_j$ denotes the jth column
of A. We will write $a_{ij} \in F$ for the ij entry of A.


Exercise 7.3.1 


(a) Show that if d is a determinant then $d(A_1, A_2, A_3, \dots, A_n) = -d(A_2, A_1, A_3, \dots, A_n)$;
that is, switching two columns changes the sign of the determinant.


(b) Then show that if you permute columns using $\sigma \in S_n$, you get
$d(A_{\sigma(1)},\dots,A_{\sigma(n)}) = \mathrm{sgn}(\sigma)\, d(A_1,\dots,A_n)$.


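Using the preceding exercise, we see that Definition 7.3.1 implies the following computation

$$\begin{aligned}
d(AB) &= d\Big(\sum_{j_1=1}^{n} A_{j_1} b_{j_1 1}, \dots, \sum_{j_n=1}^{n} A_{j_n} b_{j_n n}\Big) \\
&= \sum_{j_1=1}^{n} \cdots \sum_{j_n=1}^{n} d\big(b_{j_1 1} A_{j_1}, \dots, b_{j_n n} A_{j_n}\big) \\
&= \sum_{j_1=1}^{n} \cdots \sum_{j_n=1}^{n} b_{j_1 1} \cdots b_{j_n n}\, d(A_{j_1}, \dots, A_{j_n}).
\end{aligned}$$

Now we can see that, provided that the $j_i$ are pairwise distinct, the n-tuple $(j_1,\dots,j_n) = (\sigma(1),\dots,\sigma(n))$
for some permutation $\sigma \in S_n$. Therefore the multiple sum over the $j_i$ with
$n^n$ terms becomes a single sum over $S_n$ with n! terms - and we see that

$$d(AB) = \sum_{\sigma \in S_n} b_{\sigma(1)1} \cdots b_{\sigma(n)n}\, \mathrm{sgn}(\sigma)\, d(A). \qquad (7.8)$$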
Exercise 7.3.2 How did the multiple sum over $n^n$ terms involving the $(j_1,\dots,j_n)$ become a
single sum over $\sigma \in S_n$ with only n! terms? Explain. Then explain why
$d(A_{\sigma(1)},\dots,A_{\sigma(n)}) = \mathrm{sgn}(\sigma)\, d(A)$.


Hint. What property of determinants causes the terms to vanish in which $j_i = j_k$ for some
$i \ne k$?


Now set A = I in equation (7.8) and we obtain the result we sought (using the fact that
d(I) = 1)

$$d(B) = \det(B) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, b_{\sigma(1)1} \cdots b_{\sigma(n)n}.$$


The preceding argument shows that there is a unique function d=det having the 
properties in Definition 7.3.1. The preceding argument also shows that 


det(AB) = det(A) det(B). 


Exercise 7.3.3 Explain why det(AB) = det(A) det(B) follows from the preceding discussion.

Exercise 7.3.4 Show that, in the plane, a point $(x_1, x_2)$ is on the line joining the point
$(a_1, a_2)$ to the point $(b_1, b_2)$ iff

$$\det\begin{pmatrix} x_1 & x_2 & 1 \\ a_1 & a_2 & 1 \\ b_1 & b_2 & 1 \end{pmatrix} = 0.$$


It is possible to use Definition 7.3.1 to prove the rest of the basic results that one knows 
about determinants from that calculus course: expansion by minors, Cramer’s rule, the 
formula for $A^{-1}$, and the Laplace expansion. Such results can be found in the references
such as Dornhoff and Hohn [25], Dummit and Foote [28], or Schreier and Sperner [101]. We 
will also give a few exercises with some of these results. 


Exercise 7.3.5 Show that the determinant of an upper triangular matrix is the product of 
the entries on the diagonal. 


Exercise 7.3.6 Explain how elementary column operations affect determinants, deriving your 
results from Definition 7.3.1. Then show, by considering the matrix 


120 
214)€Z2%, 
311 


how Gaussian elimination can be used to compute a determinant. 


Exercise 7.3.7 As usual, if $A = (a_{ij}) \in F^{n\times n}$, we write the transpose of A as ${}^{T}\!A = (a_{ji})$. Show
that $\det({}^{T}\!A) = \det(A)$.


Exercise 7.3.8 Suppose $T: V \to V$ is a linear mapping of a finite-dimensional vector space
V and B is a basis of V. Using the notation (7.4) from Section 7.2, show that $\det(\mathrm{Mat}_{B,B}(T))$
is independent of the basis B.


Other references for determinants are Birkhoff and Maclane [9], Dummit and Foote [28], 
Herstein [42, Chapter 6], and any linear algebra book. The modern way of doing these things 
is called “exterior algebra” or alternating multilinear algebra. We do not have time to cover 
this subject but if you dislike messy formulas with subscripts like formula (7.7), then exterior 
algebra is for you. It is exterior algebra or the algebra of differential forms that clarifies 
Stokes’ theorem and the many other formulas of multivariable integral calculus that can be 
derived from the general version of Stokes’ theorem. For example, the view of determinants 
given in Definition 7.3.1 is important in several-variables integral calculus for helping to 
understand why the Jacobian determinant appears in the change of variables formula for 
multiple integrals. References are Schreier and Sperner [101], Lang [63], or Courant and 
John [19]. 

We wish to investigate the connection between determinants and volume. 

We take our field F to be $\mathbb{R}$, the field of real numbers. Consider a parallelepiped (also
called a parallelotope) P(A) in $\mathbb{R}^n$ spanned by the vectors $A_1,\dots,A_n$ making up the
columns of a square matrix $A \in \mathbb{R}^{n\times n}$:

$$P(A) = \Big\{ \sum_{i=1}^{n} t_i A_i \;\Big|\; t_i \in \mathbb{R},\ 0 \le t_i \le 1 \Big\}. \qquad (7.9)$$

Check it out for n = 2.
What are the defining properties of the volume of P(A)?


Definition 7.3.2 A volume function $v: \mathbb{R}^{n\times n} \to \mathbb{R}$ is a function with the following
properties:


1. $v(A) \ge 0$, for all $A \in \mathbb{R}^{n\times n}$;
2. v is an additive function of the ith column, holding the rest of the columns fixed,
for all i = 1,...,n;
3. if we multiply the ith column by a scalar c the volume is multiplied by |c|;
4. v(I) = 1;
5. v(A) = 0 if the vectors $A_i$ are not linearly independent.


Of course, in the case of property 5, one would not really consider the parallelepiped 
P(A) to be n-dimensional and thus it is not always included in the definition. 


Exercise 7.3.9 Check that the preceding definition is a reasonable definition of the volume 
of a parallelogram for n= 2. Draw some pictures of parallelograms. 


The determinant of the matrix $A = (A_1,\dots,A_n)$ may be negative. What does this mean?
It means that the vectors $A_1,\dots,A_n$ are arranged so as to give a coordinate system which
is “left handed” - that is, not the usual right-handed $x_1,\dots,x_n$-coordinates in $\mathbb{R}^n$. The standard
right-hand coordinate system is such that when you hold your right hand out with
palm up, fingers curled from x-axis to y-axis, then the thumb points up in the direction
of the z-axis. The left hand would have a thumb pointing down. This question of orientation
of the coordinates has only two possible answers. The right-hand rule makes many
appearances in physics.

Once you have this definition of the volume of a parallelepiped, you find that the volume 
of the parallelepiped P(A) in (7.9) is |det(A)|. And this begins to explain the change of 
variables formula for a multiple integral. See Lang [64] for more information on that. 


Exercise 7.3.10 State whether each of the following statements about two matrices
$A, B \in F^{n\times n}$, where F is a field, is true or false and give reasons for your answers. Recall that we
have defined a square matrix A to be non-singular iff $\det(A) \ne 0$.


(a) If the entries of A and B are all the same except for the upper left-hand corner entry,
where $b_{11} = -a_{11}$, then det B = -det A.

(b) Suppose A is non-singular and B is singular. Then A+ B is singular. 

(c) Suppose A is non-singular and B is non-singular. Then A + B is non-singular. 


Exercise 7.3.11 Suppose that A is an n x n matrix over a field F. Define the i,j minor $M_{ij}$
of matrix $A = (a_{ij})_{1 \le i,j \le n}$ to be the determinant of the (n - 1) x (n - 1) matrix obtained
from A by crossing out the ith row and the jth column. Prove the formula for det(A) using
expansion by minors of the jth column:

$$\det A = \sum_{i=1}^{n} (-1)^{i+j} M_{ij}\, a_{ij}.$$


Use this formula to compute $\frac{\partial}{\partial a_{ij}} \det A$.

Exercise 7.3.12 Suppose that A is an n x n matrix over a field F such that $\det A \ne 0$. Prove
that the ij entry of $A^{-1}$ can be expressed as:

$$\frac{1}{\det A}\, (-1)^{i+j} M_{ji}.$$


Hint. Use the preceding problem and the fact that the determinant of a matrix is 0 if two
columns of the matrix are the same. If $\delta_{kj}$ is the Kronecker delta, that is, $\delta_{kj} = 0$ if $k \ne j$ and
$\delta_{kk} = 1$, this gives

$$\delta_{kj} \det A = \sum_{i=1}^{n} (-1)^{i+j} M_{ij}\, a_{ik}.$$
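
The cofactor formula of Exercise 7.3.12 can be checked numerically. The sketch below assumes the field is $\mathbb{Q}$ (via `fractions.Fraction`), uses 0-based indices, and computes determinants by expansion by minors of the first column as in Exercise 7.3.11. The helper names and the sample matrix are hypothetical, and the code is far too slow for large n.

```python
from fractions import Fraction


def minor_matrix(a, i, j):
    """Matrix obtained from a by crossing out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion by minors of the first column."""
    if not a:
        return 1                      # determinant of the empty 0 x 0 matrix
    return sum((-1) ** i * a[i][0] * det(minor_matrix(a, i, 0))
               for i in range(len(a)))


def inverse(a):
    """The ij entry is (-1)^(i+j) * M_ji / det(a), with M_ji the j,i minor."""
    d = Fraction(det(a))
    n = len(a)
    return [[(-1) ** (i + j) * det(minor_matrix(a, j, i)) / d
             for j in range(n)] for i in range(n)]


A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
A_inv = inverse(A)
n = len(A)
check = [[sum(A_inv[i][k] * A[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
print(det(A), check)
```

Note the transposed indices: the ij entry of the inverse uses the minor obtained by deleting row j and column i.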


One can use the preceding exercise to prove Cramer’s rule which gives a formula for the 
solution of n linear equations in n unknowns as a quotient of determinants. This formula - 
like many of the formulas involving determinants - is not so important for solving linear 
equations as for using the result theoretically to see some property of the solutions of the 
linear equations. Thus it would be a mistake to leave it out of a course on linear algebra, 
but this is not such a course so we will leave it out. Nevertheless we encourage the reader 
to look at the discussion of Cramer’s rule in other texts: for example, Birkhoff and Maclane 
[9] or Dummit and Foote [28]. 

Determinants are often objects of extreme prejudice - even among mathematicians. Those 
who want to compute things really hate them. Theorists hate messy formulas with lots of 
subscripts. But for those who love matrix groups, determinants are our friends. They have 
appeared all over this text already. I am also reminded that I needed some of the messiest 
formulas for determinants when updating my book [119]. In particular, a formula derived 
from the Cauchy-Binet formula below appeared in a study of a generalized central limit the- 
orem for positive matrices. It would be impossible for me to live without determinants and 
their messiness. Multivariate statisticians would also have a difficult time without knowing 
these results. See Horn and Johnson [45] or Schreier and Sperner [101, p. 112] or Wikipedia 
for the formulas stated below. 

Let b and c denote ordered sets, each consisting of r numbers between 1 and min{m, n}.
Define $\Delta_{b,c}(X)$ to be the r x r subdeterminant of the m x n matrix X obtained by taking
the rows from b and the columns from c. The Cauchy-Binet formula says that, for an m x k
matrix L and a k x n matrix M, if $r \le \min\{m, n, k\}$ and a is an ordered set of r numbers
between 1 and m, while b is an ordered set of r numbers between 1 and n, we have

$$\Delta_{a,b}(LM) = \sum_{1 \le c_1 < \cdots < c_r \le k} \Delta_{a,c}(L)\, \Delta_{c,b}(M). \qquad (7.10)$$
Here the sum is over all ordered sets of r numbers between 1 and k. In the special case that 
r= 1, we are looking at the formula for matrix multiplication. 
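
Here is a quick numerical check of the Cauchy-Binet formula for small integer matrices. This is a sketch with 0-based index sets; `sub(X, rows, cols)` picks out a submatrix, and all names and the sample matrices are illustrative assumptions.

```python
from itertools import combinations


def det(a):
    """Determinant by expansion along the first row (fine for tiny matrices)."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def sub(x, rows, cols):
    """Submatrix of x using the given rows and columns."""
    return [[x[i][j] for j in cols] for i in rows]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def cauchy_binet_rhs(L, M, a, b):
    """Sum over increasing index sets c of det(L[a, c]) * det(M[c, b])."""
    k, r = len(M), len(a)
    return sum(det(sub(L, a, c)) * det(sub(M, c, b))
               for c in combinations(range(k), r))


L = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
M = [[1, 0],
     [2, 1],
     [0, 3]]             # 3 x 2
a = b = (0, 1)
print(det(sub(matmul(L, M), a, b)), cauchy_binet_rhs(L, M, a, b))
```

With r = 1 the same function returns a single entry of the product LM, which is the matrix multiplication formula.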

For the Laplace expansion of the determinant of an n x n matrix M, let a denote an 
ordered set consisting of r numbers between 1 and n. Define a’ to be the complementary 
ordered set of n — r=s numbers between 1 and n that are not included in a. The Laplace 
expansion of det(M) with respect to a is: 


det(M) _ S> (aera (M) Ao (M), 
1<a<---<q<n 
where r+s=n. (7.11)

This formula generalizes the formula from Exercise 7.3.11 for expansion by minors - the
case r= 1.

We have defined a square matrix A to be non-singular iff $\det(A) \ne 0$. Applied to a linear
transformation T of a vector space V into itself, this definition is equivalent to saying
that T is invertible (i.e., 1-1 and onto). This means that T is a vector space automorphism.
Otherwise T is called singular. Wikipedia gives around 20 equivalent conditions for the 
non-singularity of a square matrix A over a field. 


Exercise 7.3.13 Give at least 12 equivalent conditions for the non-singularity of a square
matrix $A \in F^{n \times n}$ over the field F.


One can define a non-singular matrix $A \in R^{n \times n}$, where R is a commutative ring with
identity for multiplication, to be a matrix such that there is a multiplicative inverse
$A^{-1} \in R^{n \times n}$ - equivalently det(A) is a unit in R. This allows the definition of a general linear group
over such a ring. For example, the general linear group (also called the modular group)
GL(n, Z) consists of all n x n integer matrices A with det(A) = ±1. For non-commutative
rings, one does not have the standard concept of determinant.


7.4 Extension Fields: Algebraic versus Transcendental 


If a field F is a subfield of a field E, we say E is an extension field of F, as in Definition 5.3.6.
This implies that E is a vector space over F, which may be infinite dimensional.


Example 


1. C is an extension field of R having dimension 2 as a vector space. A vector space basis 
can be taken to be {i, 1}. 

2. C is an infinite-dimensional extension of Q. This is a bit harder to understand. We will 
say more shortly. 

3. Suppose that f(x) ∈ F[x] is irreducible. Then the quotient ring E = F[x]/(f(x)) is a field
containing (an isomorphic copy of) F.


To see this, note that coset representatives for F[x]/(f(x)) are the polynomials of degree
less than n = deg f:

$$\sum_{j=0}^{n-1} a_j x^j, \quad \text{with } a_j \in F, \; j = 0, \ldots, n-1.$$

This follows from the division algorithm in the same way that we got representatives for
Z/nZ from the remainders of division of an integer by n. In short, the proof is the same as
that of Proposition 6.3.1 for which F was a finite field. One sees also that the equivalence
classes $[1], [x], \ldots, [x^{n-1}]$ form a basis for the vector space E = F[x]/(f(x)) over F. Thus
$n = \deg f = \dim_F E$. You can set θ = [x] and then you see that a vector space basis for E over
F is $\{1, \theta, \theta^2, \ldots, \theta^{n-1}\}$. Moreover θ is a root of f, that is, f(θ) = 0. This generalizes the first example.
The complex number i seems concrete to most of us, although it is called imaginary for
good historical reasons. However, once we replace R with any field F and the polynomial
$x^2 + 1$ with any irreducible polynomial over F, then this thing we call θ = [x] does indeed
seem rather abstract or imaginary. However, we got accustomed to computation in Z/nZ
and thus we should be able to get accustomed to computation in F[x]/(f(x)). But here we
have to think f(θ) = 0 rather than n = 0. ▲
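
Computation in F[x]/(f(x)) can be made concrete with a few lines of code. The sketch below assumes F is the prime field of integers mod p and that f is monic; polynomials are stored as coefficient lists with the constant term first, and the function names `poly_mod` and `mul_mod` are hypothetical. The sample values use p = 3 and f(x) = x^2 + 1, so θ = [x] plays the role of i.

```python
def poly_mod(a, f, p):
    """Remainder of a on division by the monic polynomial f, coefficients mod p.

    Polynomials are lists of coefficients, constant term first.
    The result always has exactly deg(f) coefficients.
    """
    a = [c % p for c in a]
    n = len(f) - 1                       # n = deg f
    while len(a) > n:
        lead = a.pop()                   # leading coefficient of the current a
        shift = len(a) - n               # subtract lead * x^shift * f(x)
        for i in range(n):
            a[shift + i] = (a[shift + i] - lead * f[i]) % p
    return a + [0] * (n - len(a))

def mul_mod(a, b, f, p):
    """Product of the cosets [a] and [b] in F_p[x]/(f(x))."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    return poly_mod(prod, f, p)

f = [1, 0, 1]            # x^2 + 1 over the integers mod 3
theta = [0, 1]           # the coset of x
print(mul_mod(theta, theta, f, 3))       # theta^2, which should equal -1 = 2 mod 3
print(mul_mod([1, 1], [1, 2], f, 3))     # (1 + theta)(1 + 2*theta)
```

The reduction step is exactly "think f(θ) = 0": every occurrence of the leading power of θ is replaced using the lower-order terms of f.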


Definition 7.4.1 The degree d of an extension F ⊂ E of fields is the dimension of E as a
vector space over F. The notation is d = [E : F].


Proposition 7.4.1 Suppose we have three fields K, E, F with F ⊂ E ⊂ K. If E is a finite-degree
extension of F and K is a finite-degree extension of E, then K is a finite-degree extension of
F and


[K : F] = [K : E] [E : F].


Proof. Suppose $\{v_i\}$ is a vector space basis of K over E and $\{w_j\}$ is a vector space basis of
E over F. Then we claim that $\{v_i w_j\}$ is a vector space basis of K over F. We leave it as an
exercise to prove this. ▲


Exercise 7.4.1 Prove the claim in the proof of the preceding proposition. 


Suppose that E is a field extension of F. If a ∈ E, let F(a) be the smallest subfield of E
containing a and F. It is also called the field generated by a over F or the field obtained
by adjoining a to F. Note the difference between F(a) and F[a]. The latter is the smallest
subring of E containing a and F. The notation is reminiscent of, but not the same as, that
for the ring of polynomials F[x] over F and the larger ring of rational functions F(x) over
F, since in that situation x is an indeterminate, while a is not. In fact, when a is an element
of a large field containing the field F, F[a] and F(a) may be the same entity, as in the
following examples, where F[i] = F(i).


Examples. Suppose that i is a root of $x^2 + 1 = 0$ in some extension field of a field F such
that $x^2 + 1$ is irreducible over F.


1. F = R: R(i) = C = {a + bi | a, b ∈ R} = R[i].

2. F = Q: Q(i) = {a + bi | a, b ∈ Q} = Q[i].

3. $F = \mathbb{F}_3$: $\mathbb{F}_3(i) = \{a + bi \mid a, b \in \mathbb{F}_3\} = \mathbb{F}_3[i]$. Here we can replace 3 with any prime p
such that -1 is not a square mod p. ▲


Definition 7.4.2 If K is an extension field of F and a ∈ K, we say that a is algebraic over
F if f(a) = 0 for some nonzero polynomial f(x) ∈ F[x]. Otherwise we say that a is transcendental
over F.


Definition 7.4.3 If K is an extension field of F such that every element of K is algebraic 
over F, we call K an algebraic extension field of F. Otherwise K is a transcendental 
extension field of F. 


Examples. In all three of the examples above, i is algebraic over F = R, Q, or $\mathbb{F}_3$. ▲

Showing that something is transcendental is hard. For example, both e and π are real
numbers that are transcendental over the rationals. Thus R is a transcendental extension of
Q. We will not prove these things here. Liouville showed (in 1844) that certain real numbers
(for example, 0.1010010000001... - the decimal expansion with 1s separated by 1!, 2!,
3!, 4!, ... zeros) are transcendental over Q. In 1873 Hermite showed that e is transcendental
over Q. Then in 1882 Lindemann showed π to be transcendental over Q. A reference for the
transcendence of e is Herstein [42]. References for more results on the subject are A. Baker
[5] as well as S. J. Miller and R. Takloo-Bighash [78].


Exercise 7.4.2 Assuming that e is known to be transcendental over Q, state whether the
following numbers are transcendental over Q.
(a) e - 1; (b) $e^{2\pi i}$, where $i = \sqrt{-1}$; (c) $\sqrt{2} + \sqrt{3}$; (d) -e.


Exercise 7.4.3 State whether each of the following statements is true or false and 
explain why. 


(a) The sum of two transcendental numbers over Q is transcendental over Q. 
(b) If a is algebraic over Q, then so is 2a. 


Definition 7.4.4 Suppose that K is an extension field of F and a ∈ K. If a is algebraic
over F, then there is a monic polynomial f(x) in F[x] of least degree such that f(a) = 0. We
call f the minimal polynomial of a over F.


It is admissible to say “the” minimal polynomial by Exercise 7.4.4 below.

Example. The minimal polynomial of i over Q or R or $\mathbb{F}_3$ is $x^2 + 1$. ▲


Exercise 7.4.4 Show that the minimal polynomial of a over F is unique. 


Hint. Suppose f and g are both minimal polynomials for a over F. Use the division algorithm. 


It is tempting to think about finding the minimal polynomial of more examples over Q
such as $\sqrt{2} + \sqrt{3}$ or $e^{2\pi i/n}$, where n = 3, 4, 5, .... However, that seems like a subject for
another sort of text. One would need to discuss irreducibility tests. So we will avoid the
subject and stick to finite fields for the most part. Testing for irreducibility of polynomials
of low degree n over finite fields can be done by the same sort of method that works to test
whether an integer is prime - assuming that integer is not too big. Just divide by the
irreducible polynomials of degree ≤ ⌊n/2⌋. There are better methods for factoring polynomials
over finite fields. See Lidl and Niederreiter [69]. Another reference is the handbook
edited by Mullin and Panario [79], which includes an article on the construction of
irreducibles among other fascinating topics. See other texts such as Birkhoff and MacLane [9],
Fraleigh [32], Gallian [33], or Herstein [42] for more information on irreducibility tests for
polynomials over the field of rational numbers.
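
The trial-division test just described is easy to sketch for a prime field. The code below is an illustration under simplifying assumptions, not an efficient algorithm: it works over the integers mod a prime p, stores polynomials as coefficient lists with the constant term first, and divides by every monic polynomial of degree at most ⌊n/2⌋ rather than only by the irreducible ones (which gives the same answer, just more slowly). The function names are hypothetical.

```python
from itertools import product

def poly_rem(a, g, p):
    """Remainder of a on division by the monic polynomial g, coefficients mod p."""
    a = [c % p for c in a]
    d = len(g) - 1                       # d = deg g
    while len(a) > d:
        lead = a.pop()
        shift = len(a) - d
        for i in range(d):
            a[shift + i] = (a[shift + i] - lead * g[i]) % p
    return a

def is_irreducible(f, p):
    """Trial division of f by all monic polynomials of degree 1, ..., floor(deg f / 2)."""
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for low in product(range(p), repeat=d):
            g = list(low) + [1]          # monic polynomial of degree d
            if not any(poly_rem(f, g, p)):
                return False             # g divides f
    return True

print(is_irreducible([1, 0, 1], 3))         # x^2 + 1 over the integers mod 3
print(is_irreducible([1, 0, 1], 5))         # x^2 + 1 over the integers mod 5
print(is_irreducible([1, 1, 0, 0, 1], 2))   # x^4 + x + 1 over the integers mod 2
```

Stopping at degree ⌊n/2⌋ is justified for the same reason one stops at the square root when testing integers: a proper factorization always has a factor of degree at most half of n.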


Exercise 7.4.5 Is x4 + 1 irreducible over F;? 


Exercise 7.4.6 Is x*+ 1 irreducible over F,,?


Proposition 7.4.2 Suppose that K is an extension field of F, a ∈ K, and a is algebraic over
F. Let f(x) be the minimal polynomial of a over F. Then f is irreducible and F[x]/(f(x))
is isomorphic to the field F(a). Moreover the cosets of the quotient ring F[x]/(f(x)) are
represented by the remainders of polynomials upon division by f. Thus F(a) = F[a].


Proof. Everything has been proved in the preceding paragraphs (including exercises) except
the irreducibility of f, the minimal polynomial of a over F. If f were reducible, we would have f = gh, where
g, h ∈ F[x] and 0 < deg g, deg h < deg f. But then 0 = f(a) = g(a)h(a) implies either g(a) = 0
or h(a) = 0. But this contradicts the minimality of the degree of f with f(a) = 0. ▲


Exercise 7.4.7 Show that if K is an extension field of F and there is a transcendental element
a ∈ K over F, then K is an infinite-dimensional vector space over F. In fact, show that then
F(a) is isomorphic to the field of fractions of the polynomial ring F[x]. This is a case in
which F(a) ≠ F[a].


Exercise 7.4.8 Suppose that F is a finite field of characteristic p. Show that every element
of F is algebraic over $\mathbb{F}_p$.


Hint. Look at the group F* of nonzero elements of F and recall Lagrange’s theorem.

Some things in the later chapters of linear algebra texts (such as eigenvalues) do not work
well for fields like R but instead require the field to be a larger field like C where all
polynomials factor completely into a product of degree 1 polynomials - that is, fields
containing all roots of det(M - xI) = 0 for any square matrix M. For C, we are referring
to the fundamental theorem of algebra which says that C is an algebraically closed field.
A field F is “algebraically closed” if all polynomials f(x) in F[x] factor completely into a
product of degree 1 polynomials from F[x]. Equivalently, to say F is algebraically closed
is to say that all roots of every f(x) in F[x] lie in F. Our favorite fields $\mathbb{F}_p$, Q, and R are not
algebraically closed, but C is.

The history of proofs of the fundamental theorem of algebra is very interesting. The 
first proofs (given by d’Alembert in 1746 and C. F. Gauss in 1799) had flaws. Gauss later 
published three correct proofs. Some analysis is usually required and thus we will not 
prove the theorem here. My favorite proof uses Liouville’s theorem from complex analysis. 
See Birkhoff and MacLane [9] for a topological proof. Another reference is Courant and 
Robbins [20]. 

Sadly finite fields are never algebraically closed. See the exercise below. However, there
is an extension field of a finite field that is algebraically closed. Such a field E is called
an algebraic closure of $\mathbb{F}_p$. We can say “the algebraic closure” because if we are given two
algebraic closures of $\mathbb{F}_p$, there is a field isomorphism from one algebraic closure to the other
fixing every element of $\mathbb{F}_p$. To show that an arbitrary field has an algebraic closure involves
use of Zorn’s lemma, which is equivalent to the axiom of choice, as well as something called
transfinite induction. We have tried to avoid these axioms here and so will avoid thinking
about general algebraic closures. The axiom of choice may sound completely reasonable. It
can be phrased as a statement about the Cartesian product of an arbitrary number of sets.
Given a family of non-empty sets $S_i$, indexed by a non-empty set I, the Cartesian product
$\prod_{i \in I} S_i$ consists of functions $f: I \to \bigcup_{i \in I} S_i$ such that $f(i) \in S_i$. The axiom of choice says that
$\prod_{i \in I} S_i$ is non-empty. See Dummit and Foote [28], Fraleigh [32], or Hungerford [47] for
more information. We should also note that the axiom of choice allows one to prove the
famous Banach-Tarski paradox. This says that a solid sphere in $\mathbb{R}^3$ can be decomposed into
five disjoint subsets, which can then be composed (by translation and rotation) into two
identical copies of the original sphere. The catch is that the five sets are scattered all over
the sphere and their volumes are undefined.

Exercise 7.5.17 in the next section says that no finite field is algebraically closed. In fact,
there are irreducible polynomials of every degree over any finite field and there is a formula
for the number of irreducible polynomials of given degree over $\mathbb{F}_p$. See Dornhoff and Hohn
[25] or Lidl and Niederreiter [69].

When I am feeling finite, this makes me very sad. Of course, you can keep adding roots
of polynomials to $\mathbb{F}_p$. This may be done in various ways. For example, PlanetMath.org
envisions taking

$$\mathbb{F}_{p^\infty} = \bigcup_{n=1}^{\infty} \mathbb{F}_{p^{n!}} \tag{7.12}$$

to get an algebraically closed field containing $\mathbb{F}_p$. The finite field $\mathbb{F}_{p^n}$ can be constructed
as the splitting field of the polynomial $x^{p^n} - x$, as we shall see in Section 7.5.


Exercise 7.4.9


(a) Is a union of fields necessarily a field? Explain your answer.
(b) Show that $\mathbb{F}_{p^\infty} = \bigcup_{n=1}^{\infty} \mathbb{F}_{p^{n!}}$ is a field, using a fact from the next section that
$\mathbb{F}_{p^{m!}} \subset \mathbb{F}_{p^{n!}}$ if m < n.


Exercise 7.4.10 Consider the eigenvalues of the adjacency matrix of a finite undirected 
graph. Show that such eigenvalues must be algebraic numbers over Q. 


Exercise 7.4.11 Represent the field $\mathbb{Q}(e^{2\pi i/3})$ as a quotient Q[x]/(f(x)). Note that
$\omega = e^{2\pi i/3}$ satisfies $\omega^3 = 1$ but $\omega^n \ne 1$, for n = 1 or 2. Thus ω is what is called a primitive third
root of unity.


Exercise 7.4.12 Do the analog of the preceding exercise with Q replaced by F,. That is, find 
a quotient F, [x] / (f(x)) containing a primitive third root of unity. 


7.5 Subfields and Field Extensions of Finite Fields 


Next suppose that E is a finite field of characteristic p (for prime p) with subfield F. As in
the last section, we say that E is a field extension of F. Both E and F are extensions of $\mathbb{Z}_p$
by Proposition 7.2.1. We can view E as a vector space over F and then $E \cong F^r$, where r is
the vector space dimension of E over F. If the dimension of F over $\mathbb{Z}_p$ is n, then $|F| = p^n$
and $|E| = p^{nr}$. All elements of E have to be algebraic over F. So we are in the situation of
the theorems of the preceding section.


Notation. We write $\mathbb{F}_{p^k}$ for the finite field with $p^k$ elements. We can use the word “the”
(implying uniqueness up to isomorphism) by the results of this section. So we have two
notations if k = 1, namely, $\mathbb{F}_p = \mathbb{Z}_p$. We can identify $\mathbb{F}_{p^k}$ with the quotient ring $\mathbb{F}_p[x]/(f(x))$
for some irreducible polynomial f of degree k over $\mathbb{F}_p$ - assuming we know that such a
polynomial exists. And we can think of the elements of $\mathbb{F}_{p^k}$ as polynomials of degree less
than k with coefficients in $\mathbb{F}_p$. Usually we write θ = [x] in $\mathbb{F}_p[x]/(f(x))$ and then elements
of $\mathbb{F}_{p^k}$ have the form

$$\sum_{i=0}^{k-1} a_i \theta^i, \quad a_i \in \mathbb{F}_p.$$

Of course there will usually be many irreducible polynomials of degree k over $\mathbb{F}_p$. But the
resulting fields will be isomorphic.


Proposition 7.5.1 We have $\mathbb{F}_{p^k} \subset \mathbb{F}_{p^n}$ ⟺ k divides n.


Proof. ⟹ $\mathbb{F}_{p^k} \subset \mathbb{F}_{p^n}$ means $(p^k)^r = p^n$, where r is the dimension of $\mathbb{F}_{p^n}$ as a vector space
over $\mathbb{F}_{p^k}$. It follows that n = kr and thus k divides n.

⟸ We postpone this proof until we have proved that $\mathbb{F}_{p^n}$ is the splitting field of $x^{p^n} - x$,
meaning the field where this polynomial factors completely into degree 1 factors. ▲


The preceding proposition implies that any subfield F of $K = \mathbb{F}_{p^n}$ must have the form
$F = \mathbb{F}_{p^k}$ such that k divides n. Moreover, the degree of K over F is [K : F] = n/k.


Example. We compute $[\mathbb{F}_{5^{21}} : \mathbb{F}_{5^3}] = 7$ since 21 = 3 · 7. ▲
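
Since subfields correspond to divisors of the exponent, the whole subfield poset can be listed mechanically. The following sketch uses the sample exponent n = 24 (an illustrative choice) and hypothetical helper names; an edge (j, k) records that the field with exponent j sits directly below the field with exponent k, which happens exactly when k/j is prime.

```python
def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def is_prime(q):
    """Naive primality test, adequate for small q."""
    return q > 1 and all(q % t for t in range(2, int(q ** 0.5) + 1))

n = 24
exponents = divisors(n)                      # one subfield F_{p^k} for each k dividing n
degrees = {k: n // k for k in exponents}     # [F_{p^n} : F_{p^k}] = n / k
# Covering relations of the poset: no subfield lies strictly between the two.
edges = [(j, k) for j in exponents for k in exponents
         if k % j == 0 and is_prime(k // j)]

print(exponents)
print(degrees)
print(edges)
```

The same two lines with n = 21 recover the degree 21 / 3 = 7 from the example above, independently of the prime p.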


In Figure 7.1 we draw the poset diagram for the subfields of $\mathbb{F}_{2^{24}}$. It is the same as the
poset diagram for the divisors of 24 (see Figure 1.13).


[Figure 7.1 The poset of subfields of $\mathbb{F}_{2^{24}}$]


Definition 7.5.1 The splitting field of a polynomial f(x) ∈ F[x] over the field F is the
smallest extension field E of F such that f factors completely into linear factors from
E[x], that is,

$$f(x) = c \prod_{i=1}^{n} (x - a_i), \quad \text{for } a_i, c \in E.$$


We say “the” splitting field since, as we will prove soon, it is unique up to field isomor- 
phism fixing elements of F. Birkhoff and Maclane [9] say “root field” instead of “splitting 
field.” Presumably that is the older terminology and it stands for the field where all the roots 
of the polynomial lie, whereas splitting field stands for the field over which the polynomial 
splits completely into linear factors.

Example 1. The splitting field of $x^2 + 1$ over R is C, the complex numbers. ▲


Example 2. The splitting field of $x^2 - 2$ over $\mathbb{F}_5$ is

$$\mathbb{F}_{25} = \mathbb{F}_5[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in \mathbb{F}_5\} \cong \mathbb{F}_5[x]/(x^2 - 2).$$ ▲


Example 3. The splitting field of $x^2 + x + 2$ over $\mathbb{F}_3$. Since $f(x) = x^2 + x + 2$ has no roots
in $\mathbb{F}_3$, it is irreducible by Proposition 5.5.1. It is not hard to see that it is also primitive.
Let θ denote a root of f(x). One computes the powers $\theta^j$ as we did in earlier examples
using the feedback shift register idea and sees that the multiplicative group $\mathbb{F}_3(\theta)^* = \langle\theta\rangle$.
It has order 8. Then $x^2 + x + 2 = (x - \theta)(x - b)$, where $b = \theta^j$, for some j. It is not hard to
see that j = 3. The fact that j = 3 will be no shock once we have understood the section on
Galois theory. In any case we have thus shown that $\mathbb{F}_3(\theta)$ is the splitting field of $x^2 + x + 2$
over $\mathbb{F}_3$. ▲
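
The table of powers in Example 3 is small enough to compute directly. In the sketch below an element a + bθ of the field with 9 elements is stored as the pair (a, b), and multiplication uses the relation θ^2 = -θ - 2 = 2θ + 1 (mod 3); the names `mul`, `add`, and `f` are hypothetical helpers, not notation from the text.

```python
P = 3

def mul(u, v):
    """Product of u = u0 + u1*theta and v = v0 + v1*theta, using theta^2 = 2*theta + 1."""
    u0, u1 = u
    v0, v1 = v
    c0 = u0 * v0                 # constant term
    c1 = u0 * v1 + u1 * v0       # coefficient of theta
    c2 = u1 * v1                 # coefficient of theta^2, to be reduced
    return ((c0 + c2) % P, (c1 + 2 * c2) % P)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def f(u):
    """Evaluate f(x) = x^2 + x + 2 at the field element u."""
    return add(add(mul(u, u), u), (2, 0))

theta = (0, 1)
powers = [(1, 0)]                # powers[j] = theta^j
for _ in range(8):
    powers.append(mul(powers[-1], theta))

for j, t in enumerate(powers):
    print(j, t)
print(f(theta), f(powers[3]))    # both should be the zero element (0, 0)
```

The first return to (1, 0) occurs at j = 8, so θ generates the multiplicative group, and the second root of f is θ^3.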


Exercise 7.5.1 Fill in the details in the last example. It helps to do the table of powers of θ.
Since 2 = bθ, it follows that b = -1/θ.


Exercise 7.5.2 Find the splitting field E of the polynomial x* + x + 1 over Fz. What is the 
degree |E: F 2]? 


Theorem 7.5.1 Any polynomial f(x) € F[x], where F is a field, has a splitting field E and 
this splitting field is unique up to field isomorphism fixing elements of F. 


Proof. Existence of E (induction on deg f). If deg f = 1, f is already in the desired form
c(x - a). Now for the induction step, we know that we can construct a field $E_1$ containing
F and a root θ of an irreducible factor g(x) of f, namely, $E_1 = F[x]/(g(x))$, with θ = [x]. So
we can factor f(x) = (x - θ)h(x) with $h \in E_1[x]$. By induction on deg f we may assume that
h(x) is completely factored into linear factors in an extension field E of $E_1$.

Uniqueness of E up to isomorphism. This follows from the next theorem in the special
case that φ is the identity function. ▲


The proof of the following theorem is another proof by induction. 


Theorem 7.5.2 Suppose that E and $E^\varphi$ are fields and $\varphi: E \to E^\varphi$ is a field isomorphism
mapping E 1-1, onto $E^\varphi$. If f(x) ∈ E[x] and $f^\varphi \in E^\varphi[x]$ is the polynomial obtained by
applying φ to all the coefficients of f, let M be the splitting field of f over E, and $M^\varphi$ be
the splitting field of $f^\varphi$ over $E^\varphi$. Then φ extends to an isomorphism between M and $M^\varphi$.


Proof. (Induction on deg f)
The result is clear if deg f = 1 or if M = E, which implies $M^\varphi = E^\varphi$.

Assume the result for f such that deg f ≤ n and prove it when deg f = n + 1.

We may assume f has an irreducible factor g(x) ∈ E[x] such that 2 ≤ deg g ≤ n + 1. Define
$g^\varphi$ to be the polynomial obtained by applying φ to all the coefficients of g. Then $g^\varphi \in E^\varphi[x]$.
Suppose that a is a root of g(x) in M and b is a root of $g^\varphi(x)$ in $M^\varphi$. We have the ring
isomorphism $\tau: E[x] \to E^\varphi[x]$ obtained by applying φ to the coefficients of the polynomials.
Then $\tau(g(x)) = g^\varphi(x)$ and τ induces an isomorphism $\bar{\tau}: E[x]/(g(x)) \to E^\varphi[x]/(g^\varphi(x))$.
Recall that $E[x]/(g(x)) \cong E(a)$ and $E^\varphi[x]/(g^\varphi(x)) \cong E^\varphi(b)$. So we compose all the maps
and obtain a field isomorphism extending φ from E(a) onto $E^\varphi(b)$. Now f(x) = (x - a)h(x)
and, by induction, φ extends to a field isomorphism between the splitting field F of h and
the splitting field $F^\varphi$ of $h^\varphi$. Since M = F(a) and $M^\varphi = F^\varphi(b)$, we are done. ▲


Even though the splitting field of a polynomial f(x) over F is only unique up to
isomorphism, we will still say “the” splitting field, as we noted earlier. For f(x) ∈ Q[x], we can
view the splitting field of f over Q as a subfield of the field C of complex numbers, by the
fundamental theorem of algebra stated in Section 7.4.

For finite fields finding an analog of C in which one can do calculus - take limits, etc. -
is not easy. It is not sufficient just to look at $\mathbb{F}_{p^\infty}$ from equation (7.12). One would also
want to ensure that Cauchy sequences of elements from $\mathbb{F}_{p^\infty}$ converge in the analog of C.
Such fields have been studied but we will not do that here. See the handbook [79].


Proposition 7.5.2 If F is a field with $p^n$ elements, then F must be the splitting field of $x^{p^n} - x$
over $\mathbb{F}_p = \mathbb{Z}_p$.


Proof. By Lagrange’s theorem from Section 3.3, any nonzero element of a field F with $p^n$
elements is a root of the polynomial $x^{p^n - 1} - 1$, since the order of the multiplicative group F*
is $p^n - 1$. So the elements of F are roots of $x(x^{p^n - 1} - 1) = x^{p^n} - x$. Moreover the polynomial
$x^{p^n} - x$ has at most $p^n$ distinct roots in F, since, by a corollary of the division algorithm, it
has at most $p^n$ roots counting multiplicity. Therefore this polynomial has exactly $p^n$ roots in F, and
F is the splitting field of $x^{p^n} - x$. ▲


Exercise 7.5.3 In the proof of the preceding proposition, why is it that $x^{p^n} - x$ does not have
roots of multiplicity larger than 1?


Next we want to know whether a general polynomial in F[x] has multiple roots. For this,
one needs to take derivatives. We do not want to talk about limits since we usually think of
F as a finite field, not the real numbers. So we define the formal derivative of a polynomial
by the formula that was proved from the limit definition of the derivative in calculus.


Definition 7.5.2 Suppose that F is any field. Then the formal derivative of

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

with $a_i \in F$, is defined by

$$f'(x) = n a_n x^{n-1} + (n-1) a_{n-1} x^{n-2} + \cdots + a_1.$$


Exercise 7.5.4 Show that the formal derivative has the following familiar properties of
derivatives, for any f, g ∈ F[x].


(a) $(f + g)' = f' + g'$;
(b) $(fg)' = f'g + fg'$;
(c) $(f(x)^n)' = n\, f(x)^{n-1} f'(x)$.

Lemma 7.5.1 A polynomial f ∈ F[x] has a multiple root in an extension field E of F iff
deg gcd(f, f') ≥ 1. Here f' is the formal derivative of f.


Proof. ⟹ Suppose that $f(x) = (x - a)^2 g(x)$, with g(x) ∈ E[x] and a ∈ E. Then by the usual
properties of derivatives from Exercise 7.5.4, we have $f'(x) = 2(x - a)g(x) + (x - a)^2 g'(x)$.
It follows that (x - a) divides gcd(f, f').

⟸ Suppose deg gcd(f, f') ≥ 1. Then gcd(f, f') is divisible by x - a for some a ∈ E, where
E is an extension field of F. Why does E exist? If x - a divides gcd(f, f'), then a is a root of
both f and f'. So f(x) = (x - a)h(x) for some h ∈ E[x]. But then f'(x) = h(x) + (x - a)h'(x)
and 0 = f'(a) = h(a). This means h(x) = (x - a)k(x) for some k ∈ E[x] and thus $(x - a)^2$
divides f(x) and a is a multiple root of f. ▲
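
The criterion of Lemma 7.5.1 is entirely computational, so it can be tried out over a prime field. This is a minimal sketch, assuming coefficients in the integers mod a prime p, polynomials stored as coefficient lists with the constant term first, and hypothetical helper names; inverses mod p are computed with Fermat's little theorem.

```python
def trim(a):
    """Drop zero leading coefficients (stored at the end of the list)."""
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def derivative(f, p):
    """Formal derivative of f, coefficients mod p."""
    return trim([(i * c) % p for i, c in enumerate(f)][1:])

def poly_rem(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over the integers mod p."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    inv = pow(b[-1], p - 2, p)                    # inverse of the leading coefficient of b
    while len(a) >= len(b):
        q = (a[-1] * inv) % p
        shift = len(a) - len(b)
        a = trim([(c - q * b[i - shift]) % p if i >= shift else c
                  for i, c in enumerate(a)])
    return a

def poly_gcd(a, b, p):
    """Monic gcd of a and b (not both zero) by the Euclidean algorithm."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_rem(a, b, p)
    inv = pow(a[-1], p - 2, p)
    return [(c * inv) % p for c in a]

f1 = [0, 2, 0, 1]        # x^3 - x over the integers mod 3
f2 = [1, 4, 4, 1]        # (x - 1)^2 (x + 1) = x^3 - x^2 - x + 1 over the integers mod 5
print(poly_gcd(f1, derivative(f1, 3), 3))    # constant gcd: no multiple roots
print(poly_gcd(f2, derivative(f2, 5), 5))    # gcd x - 1 = x + 4: multiple root at 1
```

The first example is the case p = 3, n = 1 of the polynomial $x^{p^n} - x$, whose derivative is the constant -1.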


Exercise 7.5.5 Answer the question in the proof of the preceding lemma. 


Theorem 7.5.3 For every prime p and every n = 1, 2, 3, ..., there is a finite field with $p^n$
elements which is isomorphic to the splitting field of $x^{p^n} - x$ over $\mathbb{F}_p$.


Proof. The splitting field F of $x^{p^n} - x$ over $\mathbb{F}_p$ has $p^n$ distinct roots of $f(x) = x^{p^n} - x$, since
$x^{p^n} - x$ cannot have multiple roots using the preceding lemma and the fact that gcd(f, f') =
$\gcd(x^{p^n} - x, -1) = 1$. If we set $K = \{a \in F \mid a^{p^n} = a\}$, we can show that K is a subfield of F.
We leave this as an exercise. We know that $x^{p^n} - x$ splits in K and thus F = K. Moreover,
then F is a finite field that has $p^n$ elements.

By Proposition 7.5.2, any other field E with $p^n$ elements must be a splitting field of $x^{p^n} - x$
over $\mathbb{F}_p$. This makes E isomorphic to F by an isomorphism which fixes $\mathbb{F}_p$. ▲


Exercise 7.5.6 Show that if F is the splitting field of $x^{p^n} - x$ over $\mathbb{F}_p$, then
$K = \{a \in F \mid a^{p^n} = a\}$ is closed under addition and multiplication. Thus K is indeed a subfield
of F.

Hint. Note that $(x + y)^p = x^p + y^p$, for all x, y in a field of characteristic p.


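Example. We have looked at $\mathbb{F}_8$, $\mathbb{F}_9$, and $\mathbb{F}_{25}$ in Sections 5.3, 5.6, and in this section. Next
we want to consider

$$\mathbb{F}_{16} = \mathbb{F}_2[x]/(x^4 + x + 1) \cong \{a\theta^3 + b\theta^2 + c\theta + d \mid a, b, c, d \in \mathbb{F}_2\},$$

where $\theta^4 + \theta + 1 = 0$. Here the degree of $\mathbb{F}_{16}$ over $\mathbb{F}_2$ is 4 and $\{\theta^3, \theta^2, \theta, 1\}$ is a vector space
basis of $\mathbb{F}_{16}$ as a vector space over $\mathbb{F}_2$. ▲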


Exercise 7.5.7 With the notation of the preceding example, show that:


(a) the polynomial $x^4 + x + 1$ is irreducible in $\mathbb{F}_2[x]$;
(b) the polynomial $x^4 + x + 1$ in $\mathbb{F}_2[x]$ is primitive, that is, θ generates the multiplicative
group $\mathbb{F}_{16}^*$.


Hint. For part (b), you need to make a table of powers

$$\theta^j = a_3\theta^3 + a_2\theta^2 + a_1\theta + a_0, \quad a_i \in \mathbb{F}_2,$$

using the feedback shift register idea from Section 5.6.

Now we can finally finish a proof.


Proof. Completion of proof of Proposition 7.5.1 (of the fact that m divides n implies $\mathbb{F}_{p^m}$ is
a subfield of $\mathbb{F}_{p^n}$). If n = m · r, then $p^{mr} - 1 = (p^m - 1)(p^{m(r-1)} + p^{m(r-2)} + \cdots + p^m + 1)$.
This says that $(p^m - 1)$ divides $(p^{mr} - 1) = (p^n - 1)$. It follows that $(x^{p^m - 1} - 1)$ divides
$(x^{p^n - 1} - 1)$ in $\mathbb{F}_p[x]$. We leave the proof of this as an exercise. Since $\mathbb{F}_{p^m}$ and $\mathbb{F}_{p^n}$ are the splitting fields
of $x^{p^m} - x$ and $x^{p^n} - x$ over $\mathbb{F}_p$, the proof is over. ▲


Exercise 7.5.8 Prove that if m divides n, then the polynomial $(x^{p^m - 1} - 1)$ divides
$(x^{p^n - 1} - 1)$ in $\mathbb{F}_p[x]$.

Hint. Use the formula for the sum of a geometric progression in the form:

$$\frac{y^r - 1}{y - 1} = y^{r-1} + y^{r-2} + \cdots + y + 1.$$

There are still several topics needed to complete our theory of finite fields. The first is to 
state a generalization of the result we gave as Exercise 6.3.10. The proof is essentially the 
same as that of the exercise. 


Theorem 7.5.4 The multiplicative group of a finite field is cyclic. 
Proof. We leave this as an exercise. A 


Exercise 7.5.9 Prove Theorem 7.5.4. 


Hint. See Exercise 6.3.10.

Exercise 7.5.10 Find all the generators of the multiplicative group of $\mathbb{F}_9 = \mathbb{F}_3[i]$, where
$i^2 + 1 = 0$.


There are lists of primitive polynomials in the books on finite fields such as Lidl and
Niederreiter [69]. Here we give a short list containing one primitive polynomial over $\mathbb{F}_2$ for
each small degree.


Some Primitive Polynomials over F2 
x+1, P+2x+1, P+ret+l1, At+rtl, Pt+xr+1, 
P+ati, a +2741, PHP 4+ 4er 4, P4e +1, 
oda ad, oa ea 1 


Exercise 7.5.11 Show that the fields $\mathbb{F}_2[x]/(x^3 + x + 1)$ and $\mathbb{F}_2[x]/(x^3 + x^2 + 1)$ are
isomorphic. Define the isomorphism.


The following exercise is important for the next section. 


Exercise 7.5.12 Show that the mapping of $\mathbb{F}_{p^n}$ onto itself defined by $\sigma_p(x) = x^p$ is a field
automorphism (called the Frobenius automorphism) fixing elements of $\mathbb{F}_p$ (viewing $\mathbb{F}_p$ as a
subfield of $\mathbb{F}_{p^n}$). Such field automorphisms are also called field conjugations.


Hint. Use Lemma 5.3.4. 


Vector Spaces and Finite Fields 


It is a corollary of results of this section that the polynomial $x^{p^n} - x$ factors over $\mathbb{F}_p$ as a
product of all the distinct monic irreducible polynomials having degrees that divide n.


Example. We seek to factor the polynomial $x^8 - x$ completely over $\mathbb{F}_2$. Since $8 = 2^3$, and
there are only two positive divisors of 3, we expect to get irreducible polynomials of degrees
1 and 3. We already know two irreducible polynomials of degree 3 over $\mathbb{F}_2$. They are
$x^3 + x + 1$ and $x^3 + x^2 + 1$. Both of them have as roots generators of $\mathbb{F}_8^*$. Thus they must
divide $x^8 - x$. We know that 0 and 1 are also elements of $\mathbb{F}_8$. That gives the other two
divisors of $x^8 - x$. So we find that

$$x^8 - x = x(x - 1)(x^3 + x + 1)(x^3 + x^2 + 1).$$

If θ is a root of $x^3 + x + 1$, one can show that $x^3 + x + 1 = (x - \theta)(x - \theta^2)(x - \theta^4)$. The
polynomial $x^3 + x^2 + 1$ has roots $\theta^3$, $\theta^5$, and $\theta^6$. ▲
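
The claim about which powers of θ are roots of which cubic can be checked by brute force in the field with 8 elements. The sketch below encodes an element $a_2\theta^2 + a_1\theta + a_0$ as the 3-bit integer with bits $a_2 a_1 a_0$, so that addition is XOR and reduction uses $\theta^3 = \theta + 1$; the encoding and the helper names are illustrative choices.

```python
MOD = 0b1011                     # x^3 + x + 1, the defining polynomial of the field

def gf8_mul(u, v):
    """Multiply two elements of the field with 8 elements (shift-and-add with reduction)."""
    r = 0
    while v:
        if v & 1:
            r ^= u               # add the current multiple of u
        v >>= 1
        u <<= 1                  # multiply u by theta
        if u & 0b1000:
            u ^= MOD             # reduce using theta^3 = theta + 1
    return r

def gf8_pow(u, k):
    r = 1
    for _ in range(k):
        r = gf8_mul(r, u)
    return r

def evaluate(coeffs, u):
    """Horner evaluation of a polynomial with 0/1 coefficients (constant term first) at u."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf8_mul(acc, u) ^ c
    return acc

theta = 0b010
f = [1, 1, 0, 1]                 # x^3 + x + 1
g = [1, 0, 1, 1]                 # x^3 + x^2 + 1
roots_f = [k for k in range(1, 8) if evaluate(f, gf8_pow(theta, k)) == 0]
roots_g = [k for k in range(1, 8) if evaluate(g, gf8_pow(theta, k)) == 0]
print(roots_f, roots_g)          # exponents k such that theta^k is a root
```

The two exponent sets are closed under doubling mod 7, which is the action of squaring on the powers of θ.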


Exercise 7.5.13 Check that $x^8 - x = x(x - 1)(x^3 + x + 1)(x^3 + x^2 + 1)$ over $\mathbb{F}_2$ by
multiplying out the polynomial on the right.


Exercise 7.5.14 Show that $\mathbb{F}_{p^n}$ is the splitting field of some irreducible polynomial of degree
n over $\mathbb{F}_p$.


Exercise 7.5.15 Factor the polynomial x° — x completely into irreducible factors over F,. 
Which factors are primitive? 


Exercise 7.5.16 Show that for any finite extension E of a finite field F there is an element
θ ∈ E such that E = F(θ). We call such an extension simple.


Exercise 7.5.17 Show that no finite field is algebraically closed. In fact, show that for every 
finite field F and every positive integer n, there is an irreducible polynomial over F of 
degree n. 


Exercise 7.5.18 State whether each of the following statements is true or false and give a 
reason for your answer. 

(a) $\mathbb{F}_{p^n} = \mathbb{Z}_{p^n}$;

(b) m|n ⟹ $\mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}$;

(c) $\mathbb{Q}[\sqrt{5}] = \mathbb{Q}[\sqrt{7}]$.


7.6 Galois Theory for Finite Fields


Definition 7.6.1 Suppose that m|n. The Galois group $G(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$ (which is read as the
Galois group of $\mathbb{F}_{p^n}$ over $\mathbb{F}_{p^m}$) is defined to be the set of field automorphisms $\tau: \mathbb{F}_{p^n} \to \mathbb{F}_{p^n}$
such that τ(x) = x, for all $x \in \mathbb{F}_{p^m}$ (where $\mathbb{F}_{p^m}$ is viewed as a subfield of $\mathbb{F}_{p^n}$). Here by field
automorphism we just mean that the map is a ring automorphism.


It turns out that $G(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$ is a cyclic group, as we shall see. When m = 1, $G(\mathbb{F}_{p^n}/\mathbb{F}_p)$
is generated by the Frobenius automorphism $\sigma_p$ of Exercise 7.5.12. Such field
automorphisms are also called conjugations because they generalize complex conjugation, which is
the generator of the Galois group G(C/R). Some references for Galois theory are Birkhoff
and MacLane [9], Dornhoff and Hohn [25], Dummit and Foote [28], and Herstein [42]. The
fundamental theorem of Galois theory basically says that there is a 1-1 correspondence
between intermediate fields E such that $\mathbb{F}_{p^m} \subset E \subset \mathbb{F}_{p^n}$ and subgroups H of $G = G(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$.
The subgroup H of G corresponding to E is $H = H(E) = \{\tau \in G \mid \tau x = x, \ \forall x \in E\}$. The
intermediate field E corresponding to a subgroup H of G is $E = E(H) = \{x \in \mathbb{F}_{p^n} \mid \tau x = x, \ \forall \tau \in H\}$.
The fundamental theorem of Galois theory says that the correspondence between
intermediate fields and subgroups is 1-1, onto, and inclusion reversing. Moreover the degree of
the extension $[\mathbb{F}_{p^n} : \mathbb{F}_{p^m}] = n/m$ is equal to the order of the Galois group G. So, for example,
the poset diagram for fields E such that $\mathbb{F}_{p^m} \subset E \subset \mathbb{F}_{p^n}$ is the same as that for subgroups of
$G = G(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$, except that all inclusion lines are reversed.


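Example. What does Galois theory say about the extension $\mathbb{F}_{2^8}/\mathbb{F}_2$? What are the
intermediate fields between $\mathbb{F}_2$ and $\mathbb{F}_{2^8}$? They are $\mathbb{F}_{2^1}$, $\mathbb{F}_{2^2}$, $\mathbb{F}_{2^4}$, $\mathbb{F}_{2^8}$. So the only non-trivial
intermediate fields are $\mathbb{F}_4$ and $\mathbb{F}_{16}$. The Galois group $G = G(\mathbb{F}_{2^8}/\mathbb{F}_2) = \langle\sigma_2\rangle$ is a cyclic group
of order 8 generated by the Frobenius automorphism $\sigma_2$. The group G has exactly two
non-trivial proper subgroups $\langle\sigma_2^2\rangle$ and $\langle\sigma_2^4\rangle$, which correspond to the two non-trivial intermediate
fields: $G(\mathbb{F}_{2^8}/\mathbb{F}_{2^2}) = \langle\sigma_2^2\rangle$ and $G(\mathbb{F}_{2^8}/\mathbb{F}_{2^4}) = \langle\sigma_2^4\rangle$. ▲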
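
The correspondence in this example can be observed numerically by counting the elements fixed by each power of the Frobenius map. The sketch below builds the field with 256 elements as polynomials over the integers mod 2 reduced by $x^8 + x^4 + x^3 + x + 1$; that particular irreducible polynomial is an assumption made here for illustration (any irreducible polynomial of degree 8 gives an isomorphic field), and elements are encoded as 8-bit integers.

```python
MOD = 0x11B                      # x^8 + x^4 + x^3 + x + 1 (assumed irreducible modulus)

def mul(u, v):
    """Multiply two elements of the field with 256 elements."""
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if u & 0x100:
            u ^= MOD             # reduce when the degree reaches 8
    return r

def frob(u, times=1):
    """Apply the Frobenius map x -> x^2 the given number of times."""
    for _ in range(times):
        u = mul(u, u)
    return u

# Number of elements fixed by the k-th power of the Frobenius map, k = 1, ..., 8.
sizes = {k: sum(1 for u in range(256) if frob(u, k) == u) for k in range(1, 9)}
print(sizes)

# The Frobenius map respects addition (XOR in this encoding).
additive = all(frob(u ^ v) == frob(u) ^ frob(v) for u in range(256) for v in range(256))
print(additive)
```

The fixed sets of the second and fourth powers have 4 and 16 elements, matching the two non-trivial intermediate fields, and only the eighth power fixes everything, so the Frobenius map has order 8.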


Theorem 7.6.1 Suppose that p is a prime and $n, m \in \mathbb{Z}^+$ such that m|n. The Galois
group $G(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$ is a cyclic group of order n/m generated by $\sigma_p^m$, where $\sigma_p$ is the Frobenius
automorphism defined by $\sigma_p(x) = x^p$, for all $x \in \mathbb{F}_{p^n}$.


Proof. We know that $\sigma_p$ is an automorphism of $\mathbb{F}_{p^n}$ by Exercise 7.5.12. Now suppose
$\tau \in G(\mathbb{F}_{p^n}/\mathbb{F}_p)$ and suppose θ generates the multiplicative group $\mathbb{F}_{p^n}^*$. Then $\tau(\theta) = \theta^k$, for some
integer k such that $0 < k < p^n - 1$. It follows that $\tau(x) = x^k$, for all $x \in \mathbb{F}_{p^n}$.

We need to show that $k = p^i$, for some i, for then $\tau = \sigma_p^i$ and we are done. We will prove
$k = p^i$ by contradiction.

Assume, contradicting $k = p^i$, that $\tau(x) = x^k$ with $k = p^s k_0$, where p does not divide $k_0$
and $k_0 > 1$. Set f(x) = τ(1 + x) - 1 - τ(x). Then we can consider f(x) as a polynomial in
$\mathbb{F}_p[x]$ and we have the following equalities

$$f(x) = (1 + x)^k - 1 - x^k = \left(1 + x^{p^s}\right)^{k_0} - 1 - x^{p^s k_0} = k_0\, x^{p^s} + \binom{k_0}{2} \left(x^{p^s}\right)^2 + \cdots + k_0 \left(x^{p^s}\right)^{k_0 - 1}.$$

This polynomial cannot vanish identically on $\mathbb{F}_{p^n}$. Why? The problem is that $\mathbb{F}_{p^n}$ has $p^n$
elements but the degree of the polynomial is smaller than $p^n$. But this gives a contradiction
to the fact that - as a function on $\mathbb{F}_{p^n}$ - we have f(x) = τ(1 + x) - 1 - τ(x), which must
be 0 for all $x \in \mathbb{F}_{p^n}$, since τ is a field automorphism. This contradiction completes our proof
of the theorem. ▲


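Exercise 7.6.1 Suppose $\sigma \in G(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$, $a \in \mathbb{F}_{p^n}$, $f(x) \in \mathbb{F}_{p^m}[x]$ with f(a) = 0. Show that
f(σ(a)) = 0. Thus elements of $G(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$ permute the roots of polynomials $f(x) \in \mathbb{F}_{p^m}[x]$.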




Exercise 7.6.2 Consider the smallest field E containing $\mathbb{F}_5$ and roots of $x^2 - 2 = 0$ and
$x^2 - 3 = 0$. What is the degree of E over $\mathbb{F}_5$? A primitive polynomial of degree 2 over
$\mathbb{F}_5$ is $f(x) = x^2 + x + 2$. Let θ be a root of f(x). What powers of θ represent $\sqrt{2}$ and $\sqrt{3}$,
if any?


Exercise 7.6.3 Show that x* + 2° + 2x+2 is a primitive irreducible polynomial over Fs. 
What is the degree of the extension of Fs; generated by any root of this polynomial? 


Exercise 7.6.4 Use the preceding exercise to find the intermediate fields between Fs 
and F,;. Draw the poset diagram and then do the same for the corresponding Galois 
groups. 


Galois theory for extensions of characteristic 0 fields like the rationals is not so simple 
as that for finite fields. Non-commutative groups can be Galois groups. For example, one 
finds that the Galois group of $\mathbb{Q}[\sqrt[3]{2}, e^{2\pi i/3}]$ over $\mathbb{Q}$ is $S_3$. See Dummit and Foote [28] or
Gallian [33]. As we said earlier, it is also possible to write down equations of degree 5 whose
roots over $\mathbb{Q}$ have Galois group $S_5$. This is important because $S_5$ is not a solvable group
(as defined below). Thus one has an example of a quintic (i.e., 5th degree) equation whose
roots cannot be found by repeated radicals. We gave an example in Section 3.4. Of course, 
it is possible that there are other methods that lead to solutions of quintic equations: for 
example, using special functions such as modular functions. 

A group G is said to be solvable iff it has a series of subgroups H; such that 


$$\{e\} = H_0 \subset H_1 \subset \cdots \subset H_r = G,$$


where $H_i$ is normal in $H_{i+1}$ and the quotient $H_{i+1}/H_i$ is Abelian for each $i = 0, 1, \ldots, r - 1$.
Every commutative group is solvable. In 1963 Walter Feit and John G. Thompson proved 
that every finite group of odd order is solvable. This is easy to state but the proof required 
254 pages. I remember hearing Feit give a colloquium on the subject when I was an under- 
graduate. It soon became clear to me that the proof of this famous theorem would not be 
easily understood. 

R. Dedekind gave the first formal lectures on Galois theory in 1857. Many long-standing 
classical problems were solved using Galois theory. As we said, it was shown that one can- 
not solve all polynomial equations with rational coefficients by repeated radicals, although 
that works for polynomials of degrees 2, 3, and 4. The famous classical Greek problems of 
ruler and compass constructions (angle trisection, circle squaring, cube duplicating) were 
also proved impossible via Galois theory. 

There are analogs of Galois theory for coverings of Riemann surfaces, topological 
manifolds and graphs. See Terras [117] for a graph theory version. 


Exercise 7.6.5 Suppose that $F$ is a finite field and $f(x) \in F[x]$ with $n = \deg f$. Define the
reciprocal polynomial $f^*(x) = x^n f(1/x)$. If

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

then

$$f^*(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n.$$

Prove the following two facts about reciprocal polynomials, assuming that $f(x)$ is not a
constant and $a_0 a_n \neq 0$.


(a) The polynomial f is irreducible over F if and only if the reciprocal polynomial f* is 
irreducible over F. 

(b) If $F = \mathbb{F}_q$, a finite field, the polynomial $f$ is primitive (i.e., a root $\theta$ generates the
multiplicative group $\mathbb{F}_q(\theta)^*$) iff the reciprocal polynomial $f^*$ is primitive.


For the next exercises we need the following definition. 


Definition 7.6.2 If $F$ is a field and $A$ is an $n \times n$ matrix with entries in $F$, define the
characteristic polynomial of $A$ to be $p_A(x) = \det(A - xI)$, where $I$ denotes the identity
matrix.


Exercise 7.6.6 Show that a $3 \times 3$ triangular matrix satisfies its own characteristic polynomial.


Exercise 7.6.7 Consider any $n \times n$ matrix $A$ over a field $F$. Show that there is an invertible $n \times n$
matrix $U$ over some finite-degree extension field $E$ of $F$ such that $U^{-1}AU$ is upper triangular.


Hint. Use induction on $n$. You need to find a basis of the vector space $E^n$ such that the matrix
of the linear transformation $v \mapsto Av$ is upper triangular. Start the basis with an eigenvector
$v_1$ of $A$, meaning that $Av_1 = \lambda v_1$. The eigenvalue $\lambda$ is a root of the characteristic polynomial
$p_A(x) = \det(A - xI)$ of $A$ and thus is in some finite-degree extension $F[\lambda]$. With respect to
any basis containing $v_1$ as its first element, the matrix of $A$ looks like

$$\begin{pmatrix} \lambda & B \\ 0 & C \end{pmatrix}, \quad \text{where } C \text{ is } (n-1) \times (n-1).$$


Exercise 7.6.8 Prove the Cayley-Hamilton theorem, which states that $A$ satisfies its
characteristic polynomial. That is, $p_A(A) = 0$.


Hint. Use Exercise 7.6.7. 


Exercise 7.6.9 If $F$ is a field and $f(x) = x^n - a_{n-1}x^{n-1} - \cdots - a_1 x - a_0$ is a polynomial in
$F[x]$, define the companion matrix $C_f$ by formula (7.2) in Section 7.1. Prove the following:


(a) $\det(C_f - xI) = (-1)^n f(x)$;
(b) $\det(xC_f - I) = (-1)^n f^*(x)$, where $f^*$ denotes the reciprocal polynomial to $f$ in Exercise
7.6.5.


Exercise 7.6.10 Suppose that $\mathbb{F}_q$ is a finite field and $f(x)$ is a monic irreducible polynomial
with coefficients in $\mathbb{F}_q$. Let $\theta$ denote a root of $f(x)$ in an extension field of $\mathbb{F}_q$.
Let $f(x) = x^k - a_{k-1}x^{k-1} - \cdots - a_1 x - a_0$. The elements of the multiplicative group $\mathbb{F}_{q^k}^* =
\mathbb{F}_q(\theta)^*$ have the form $u = s_0 + s_1\theta + \cdots + s_{k-1}\theta^{k-1}$. Thus $\theta u = s_0' + s_1'\theta + \cdots + s_{k-1}'\theta^{k-1}$.
Prove that if $C_f$ denotes the companion matrix of $f$, then, writing the column vectors
$w = {}^t(s_0, s_1, \ldots, s_{k-2}, s_{k-1})$ and $w' = {}^t(s_0', s_1', \ldots, s_{k-2}', s_{k-1}')$, we have $w' = C_f w$. Note that
this says that repeated multiplication by the companion matrix of $f$ produces the log table
for the multiplicative group $\mathbb{F}_q(\theta)^*$ which we considered in many examples such as the
one in Section 5.6 where the field was $\mathbb{F}_3$ and the polynomial was $x^2 - x - 1$. In our log
tables, if we start with $w_0 = {}^t(1, 0, 0, \ldots, 0)$, which corresponds to the element 1 of the field
extension, then the $j$th row of the table of powers $\theta^j$ is the transpose of the vector $w_j = C_f^j w_0$,
for $j = 0, 1, 2, \ldots$.
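
The statement of Exercise 7.6.10 can be tried out numerically. The following is a minimal Python sketch, assuming that $q$ is prime (so that arithmetic in $\mathbb{F}_q$ is just arithmetic mod $q$) and assuming that the companion matrix of formula (7.2) has 1s below the diagonal and $a_0, \ldots, a_{k-1}$ in its last column, as in formula (8.3). The function names `companion`, `mat_vec`, and `log_table` are illustrative choices, not notation from the text.

```python
def companion(a, q):
    """Companion matrix of x^k - a[k-1] x^(k-1) - ... - a[0] over F_q (q prime).

    Assumed layout: 1s on the subdiagonal, a[0..k-1] in the last column.
    """
    k = len(a)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1
    for i in range(k):
        C[i][k - 1] = a[i] % q
    return C


def mat_vec(C, w, q):
    """Matrix times column vector, reduced mod q."""
    return [sum(C[i][j] * w[j] for j in range(len(w))) % q for i in range(len(C))]


def log_table(a, q):
    """Rows w_j = C^j w_0 for j = 0, ..., q^k - 2, starting from w_0 = (1, 0, ..., 0)."""
    k = len(a)
    C = companion(a, q)
    w = [1] + [0] * (k - 1)
    rows = []
    for _ in range(q ** k - 1):
        rows.append(tuple(w))
        w = mat_vec(C, w, q)
    return rows


# f(x) = x^2 - x - 1 over F_3, so a_0 = a_1 = 1.
# Row j holds (s_0, s_1) with theta^j = s_0 + s_1 * theta.
table = log_table([1, 1], 3)
for j, row in enumerate(table):
    print(j, row)
```

For this primitive polynomial the eight rows are distinct, which is what one expects of a table of powers of a generator of $\mathbb{F}_9^*$.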


Exercise 7.6.11 Given that $x^3 + 2x + 1$ is a primitive polynomial over $\mathbb{F}_3$, compute the table
of powers of a root $\theta$ of $x^3 + 2x + 1$, using the result of the preceding exercise.


Exercise 7.6.12 Use the solution of the preceding problem to show that the 12 roots of

$$\frac{x^{13} - 1}{x - 1}$$

in $\mathbb{F}_3(\theta)$ are $\theta^2, \theta^4, \theta^6, \ldots, \theta^{24}$.


Exercise 7.6.13 Show that the Galois group of $\mathbb{C}$ over $\mathbb{R}$ is generated by $\tau(x + iy) = x - iy$,
for $x, y \in \mathbb{R}$.


Perhaps we should say a bit more about Evariste Galois, as his story is a dramatic one. 
Eric Temple Bell [8] titles his chapter on Galois “Genius and Stupidity.” Galois’ life was 
short (1811-1832) - cut short by a duel. His works are also short (60 pages) but have led 
to much of modern algebra. Much of the work was written quickly (at age 20) the night 
before the duel. The famous quote from this night is: “I have no time.” Tragically Cauchy 
lost Galois’ memoir, and Fourier died before presenting another of Galois’ memoirs to the 
French Academy. It was 14 years after the death of Galois that Liouville received his works 
and the Galois papers were published. See Edna Kramer [59] for more of this history. 

We have not proved the fundamental theorem of Galois theory. You can find a proof in 
Birkhoff and Maclane [9]. 
8.1 Random Number Generators


A sequence of random numbers should be something like a sequence of 0s and 1s obtained
by flipping a coin with 0 for tails and 1 for heads. How should one know if the coin is fair
or the sequence is random? You would expect that the relative proportion of 0s in a large
number of flips should approach 0.5. Similarly you would expect that the proportion of
two successive 0s would be 0.25 in a large number of flips.

References for this section include: D. Austin [3], R. P. Brent [11], B. Cipra [16], P. Diaconis 
[24], M. Goresky and A. Klapper [36], D. E. Knuth [54], R. Lidl and H. Niederreiter [69], 
G. Marsaglia [74], H. Niederreiter [81], W. H. Press et al. [86], and A. Terras [116]. 

There are many uses for sequences of random numbers: for example, simulations of 
natural phenomena using “Monte Carlo” methods, systems analysis, software testing, and 
cryptography. Recently Monte Carlo methods have been used to test whether gerryman- 
dering has occurred in designing congressional election districts in states like Wisconsin or 
North Carolina (see Gregory Herschlag, Robert Ravier, and Jonathan C. Mattingly [41] or 
Jonathan C. Mattingly and Christy Vaughn [75]). The idea is to use Markov chain Monte 
Carlo methods to create random redistrictings of a state and then compare election results. 
We discuss Markov chains in Section 8.4. 

Monte Carlo methods also allow you to approximate $\frac{1}{V}\int_D f(x)\,dx$, where $V$ is the volume
of the domain $D$ in $\mathbb{R}^n$, by the average value of $f$ on a “random” finite set of points in $D$.
Sadly, the first use of Monte Carlo methods seems to have been in work of Metropolis, Ulam 
and von Neumann that led to the atomic bomb. These methods appear in the list contained 
in Barry Cipra’s article [16] discussing the top 10 algorithms of the twentieth century. 
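
The averaging idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the domain is taken to be the unit square $D = [0,1]^2$ (so $V = 1$), the integrand is $f(x, y) = xy$ (exact mean value $1/4$), and the points come from Python's built-in `random` module rather than from any generator discussed in this section. The name `monte_carlo_mean` is a hypothetical helper.

```python
import random


def monte_carlo_mean(f, sample_point, n_points, seed=0):
    """Average f over n_points pseudo-random points of the domain.

    sample_point(rng) must return one point of the domain, uniformly distributed.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_points):
        total += f(sample_point(rng))
    return total / n_points


# Domain D = [0,1]^2 with volume V = 1; f(x, y) = x*y has mean value 1/4 on D.
estimate = monte_carlo_mean(
    lambda p: p[0] * p[1],
    lambda rng: (rng.random(), rng.random()),
    100_000,
)
print(estimate)
```

The error of such an estimate typically shrinks like the reciprocal of the square root of the number of points, which is why the quality of the "random" points matters so much.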

Where do random numbers come from? In the old days there were tables (e.g., that of
the RAND corporation from 1955). One can also get random sequences of 0s and 1s from
tossing a fair coin or from times between clicks of a Geiger counter near some radioactive
material. Algebra gives us random numbers (technically, pseudo random numbers) much
more simply, as D. H. Lehmer found in 1949. We will consider a simple example. Take a
random walk through the multiplicative group $\mathbb{Z}_{17}^* = \langle 3 \pmod{17} \rangle$. In Figure 8.1 we list
the elements in order of the powers $3^j \pmod{17}$. The figure is the directed Cayley graph
$X(\mathbb{Z}_{17}^*, \{3 \pmod{17}\})$. This is very non-random! If instead, we order the vertices according
to the usual ordering of the elements of $\mathbb{Z}_{17}^*$, thinking of the group as a set of integers
$\{1, 2, 3, \ldots, 16\}$, then we get the view of the same directed graph that is found in Figure 8.2.
This second view of the same Cayley graph looks much more random. If you imagine doing
the same thing with a truly large prime $p$ instead of 17, you would certainly expect to get
a random listing of integers by taking the random walk. Assuming you take a primitive
root mod $p$ as your multiplier, the walk would go through all elements of the multiplicative
group mod $p$.
Figure 8.1 The Cayley graph $X(\mathbb{Z}_{17}^*, \{3 \pmod{17}\})$


Figure 8.2 The same graph as in Figure 8.1 except that now the vertices are given the usual
ordering $1, 2, 3, 4, \ldots, 16$


The method of generating random numbers created by D. H. Lehmer in 1949 is called the
linear congruential method. To get a sequence $\{X_n\}_{n \geq 0}$ of (pseudo-) random integers, you
fix four numbers: $m$, the modulus; $a$, the multiplier; $c$, the increment; and $X_0$, the starting
point. Then you generate the sequence recursively with the formula

$$X_{n+1} \equiv aX_n + c \pmod{m}. \tag{8.1}$$

This is called a linear congruential random number generator. Let us refer to it as
$L(X_0, a, c, m)$. If $m = 17$, $a = 3$, $c = 0$, we get the sequence represented in Figures 8.1 and
8.2. In the old days, popular choices were $m = 2^{31} - 1$ (a Mersenne prime) and $m = 2^{31}$.
There was an infamous random number generator RANDU, which was built into the IBM
mainframe computers of the 1960s. RANDU took $m = 2^{31}$, $a = 65\,539$, $c = 0$. For many years
Matlab took $m = 2^{31} - 1$, $a = 7^5$, $c = 0$.
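
Formula (8.1) translates directly into code. The following is a minimal Python sketch; the generator name `lcg` and the starting value $X_0 = 1$ are illustrative assumptions, not choices made in the text.

```python
from itertools import islice


def lcg(x0, a, c, m):
    """Yield X_0, X_1, X_2, ... with X_{n+1} = (a * X_n + c) mod m, as in (8.1)."""
    x = x0
    while True:
        yield x
        x = (a * x + c) % m


# L(1, 3, 0, 17): the walk through the powers of 3 mod 17.
walk = list(islice(lcg(1, 3, 0, 17), 17))
print(walk)

# First few values from the RANDU parameters, starting (by assumption) at X_0 = 1.
print(list(islice(lcg(1, 65539, 0, 2**31), 3)))
```

With $c = 0$ and $X_0 = 1$ the generator simply lists the powers of $a$ mod $m$, so the first call reproduces the ordering of the vertices in Figure 8.1.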

Clearly there are choices of $a, c, m$ that lead to very non-random results. For example, look
at $a = c = 1$. Then if $X_0 = 0$, you get $X_1 = 1, X_2 = 2, \ldots$ - marching through $1, 2, 3, \ldots, m$ in
order. Certainly we want to hit all the numbers in $1, 2, 3, \ldots, m$ - or at least as many as
possible. We must think about this problem a bit. The main theorem is a result of Hull and Dobell
[46]. By the period $t$ of the linear congruential random number generator $L(X_0, a, c, m)$, we
mean the minimum positive integer $t$ such that $X_k = X_{k+t}$, for large enough $k$. Note that the
period cannot be $m$ if $c = 0$, for example. The period could then be the Euler phi-function
$\phi(m)$ if one could choose $a$ to be a primitive root mod $m$, that is, a generator of $\mathbb{Z}_m^*$.
Of course, $\mathbb{Z}_m^*$ is not cyclic for all - or even most - values of $m$. The following theorem has
the object of producing a period of $m$.


Theorem 8.1.1 (Hull and Dobell). The period of the linear congruential random number 
generator defined by (8.1) is m if and only if all of the following three conditions hold: 


(1) gcd(c,m) = 1; 
(2) if a prime p divides m, then p divides (a — 1); 
(3) if 4 divides m, then 4 divides (a — 1). 
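
The three conditions are easy to test by machine and to compare with a brute-force computation of the period. The sketch below is an illustration only; the helper names `prime_factors`, `hull_dobell`, and `lcg_period` are hypothetical, and the period is found by simply iterating (8.1) until a value repeats.

```python
from math import gcd


def prime_factors(m):
    """Set of primes dividing m, by trial division."""
    factors, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors


def hull_dobell(a, c, m):
    """True when conditions (1), (2), (3) of Theorem 8.1.1 all hold."""
    return (
        gcd(c, m) == 1
        and all((a - 1) % p == 0 for p in prime_factors(m))
        and (m % 4 != 0 or (a - 1) % 4 == 0)
    )


def lcg_period(x0, a, c, m):
    """Length of the cycle that X_{n+1} = (a X_n + c) mod m eventually enters."""
    first_seen = {}
    x, n = x0, 0
    while x not in first_seen:
        first_seen[x] = n
        x = (a * x + c) % m
        n += 1
    return n - first_seen[x]


for a, c, m in [(3, 5, 8), (5, 3, 8), (7, 5, 18)]:
    print((a, c, m), hull_dobell(a, c, m), lcg_period(0, a, c, m))
```

The three parameter triples are those of Examples 1, 2, and 3 below; the printed periods can be compared with the sequences listed there.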


We follow the discussion of the theorem in Knuth [54, pp. 15-21]. In order to understand 
this theorem, we should first prove a lemma. 


Lemma 8.1.1 Assume that $k \in \mathbb{Z}^+$ and $a \in \mathbb{Z}$. Define $\frac{a^k - 1}{a - 1} \pmod{m}$ to mean

$$1 + a + \cdots + a^{k-1} \pmod{m},$$

if $\gcd(a - 1, m) \neq 1$. Then if a sequence of numbers $X_k$ is given using the recursion in (8.1),
we have the formula

$$X_k \equiv a^k X_0 + c\,\frac{a^k - 1}{a - 1} \pmod{m}.$$

Exercise 8.1.1 Prove the preceding lemma by induction on $k$.
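
Before proving the lemma, one can check it numerically. The following Python sketch compares $k$ steps of the recursion (8.1) with the closed form, reading $\frac{a^k - 1}{a - 1}$ as the sum $1 + a + \cdots + a^{k-1}$ so that no division mod $m$ is needed. The function names are illustrative, and a numerical check of this kind is of course not a proof.

```python
def lcg_iterate(x0, a, c, m, k):
    """Apply X -> (a X + c) mod m exactly k times, starting from x0."""
    x = x0
    for _ in range(k):
        x = (a * x + c) % m
    return x


def lcg_closed_form(x0, a, c, m, k):
    """a^k X_0 + c (1 + a + ... + a^(k-1)) mod m, as in Lemma 8.1.1."""
    geometric_sum = sum(pow(a, i, m) for i in range(k)) % m
    return (pow(a, k, m) * x0 + c * geometric_sum) % m


# Example: m = 18, a = 7, c = 5, X_0 = 0.
print([lcg_closed_form(0, 7, 5, 18, k) for k in range(1, 7)])
print([lcg_iterate(0, 7, 5, 18, k) for k in range(1, 7)])
```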


Now we consider a few examples of the random number generator

$$X_{n+1} \equiv aX_n + c \pmod{m},$$

before attempting a proof of the theorem of Hull and Dobell.


Example 1. Consider $m = 8$, $c = 5$, $a = 3$, $X_0 = 0$ in (8.1). You get the sequence 0, 5, 4, 1, 0.
The period is 4 not 8. Condition (3) of the theorem does not hold, since 4 divides $m = 8$ but
4 does not divide $a - 1 = 2$. ▲


Example 2. Consider $m = 8$, $c = 3$, $a = 5$, $X_0 = 0$ in (8.1). You get the sequence 0, 3, 2, 5, 4, −1,
−2, −7, 0 (writing some residues mod 8 with negative representatives). The period is 8. The
conditions of the theorem hold. ▲


Example 3. Consider $m = 18$, $c = 5$, $a = 7$, $X_0 = 0$ in (8.1). You get the sequence

0, 5, 4, 15, 2, 1, 12, 17, 16, 9, 14, 13, 6, 11, 10, 3, 8, 7, 0.

The period is 18 and the conditions of the theorem hold. ▲


Exercise 8.1.2 Find the period of the linear congruential random number generator
$L(X_0, a, c, m)$ defined by formula (8.1) if $X_0 = 0$, $a = 3$, $c = 1$, $m = 16$.


Exercise 8.1.3 Find the period for the linear congruential random number generator
$L(X_0, a, c, m)$ defined by formula (8.1) if $X_0 = 0$, $a = 13$, $c = 7$, $m = 36$.


Exercise 8.1.4 Find the period for the linear congruential random number generator
$L(X_0, a, c, m)$ defined by formula (8.1) if $X_0 = 0$, $a = 5$, $c = 2$, $m = 45$.


Exercise 8.1.5 Find the period for the linear congruential random number generator
$L(X_0, a, c, m)$ defined by formula (8.1) if $X_0 = 0$, $a = 16$, $c = 2$, $m = 45$.


Now we want to reduce our proof to the case of prime powers. To do this, we need the 
following exercise. 


Exercise 8.1.6 Suppose that $m = m_1 m_2$ with $\gcd(m_1, m_2) = 1$. Using the notation set up
after formula (8.1) for linear congruential random number generators, show that the
period of $L(X_0, a, c, m)$ is the least common multiple of the periods of $L(X_0, a, c, m_1)$ and
$L(X_0, a, c, m_2)$.


Now we want to prove the necessity of the three conditions in Theorem 8.1.1 for the case
$m = p^e$, where $p$ is an odd prime. It suffices to consider the case $X_0 = 0$. Why? Moreover, if
$\gcd(c, m) \neq 1$, the period cannot be $m$. To prove the necessity of condition (2), we assume
$p$ does not divide $a - 1$. Then by Lemma 8.1.1, if the period were $p^e$, $e \geq 1$, we would have
$a^{p^e} - 1 \equiv 0 \pmod{p^e}$. But this contradicts $a^{p^e} \equiv a \pmod{p}$.

To understand the necessity of condition (3), note that if $m = 2^e$ and $a \equiv 3 \pmod{4}$,
we have

$$\frac{a^{2^{e-1}} - 1}{a - 1} = 1 + a + \cdots + a^{2^{e-1} - 1} \equiv 0 \pmod{2^e}.$$

To see this, note that $a^2 \equiv 1 \pmod{8}$. So $a^4 \equiv 1 \pmod{16}$, $a^8 \equiv 1 \pmod{32}, \ldots$ If $a \equiv
3 \pmod{4}$, then $a - 1$ is twice an odd number. It follows that $\frac{a^{2^{e-1}} - 1}{a - 1} \equiv 0 \pmod{2^e}$.
By Lemma 8.1.1, the sequence then repeats after $2^{e-1}$ steps, so the period cannot be $m = 2^e$.
Thus we get a contradiction to $a \equiv 3 \pmod{4}$.

For the sufficiency of the conditions, see Knuth [54].


Exercise 8.1.7 Fill in all the details in the proof of the necessity of the three conditions for
the case $m = p^e$, where $p$ is an odd prime, in Theorem 8.1.1.


What is the meaning of random? That is a question for statisticians who have devised 
tests to tell whether our lists are reasonably random. This is not a statistics course and 
thus we refer you to some of the references listed at the beginning of this section. Usually 
the applied mathematician wants a sequence of random real numbers in the interval [0, 1] 
which approximates being uniformly distributed. To get this from $X_n$, you just divide by
the modulus m which you used to generate them. Of course some properties of random 
sequences in 1,2,3,..., m will be impossible to produce using linear congruential random 
number generators: for example, repetitions of numbers. For such properties you can con- 
sider sequences produced by linear recurrences over finite fields such as (8.2) below. We 
will consider some statistical properties of random number generators arising from finite 
fields at the end of this section. 

In the late 1960s applied mathematicians using random numbers for Monte Carlo methods
became angry when they discovered that if they used the linear congruential random
integers $X_n$ to produce vectors in $[0, 1]^n$, for $n > 1$, by writing $v = \frac{1}{m}(X_1, \ldots, X_n)$, they would
have vectors lying in hyperplanes. Thus Marsaglia wrote a paper [74] proving that such vectors
will fall into less than $(n!\,m)^{1/n}$ hyperplanes. For example if the modulus $m = 2^{32}$ and
$n = 10$, we get at most 41 hyperplanes. Of course no one had really tested the random
vectors produced in this way for their uniform distribution in the hypercubes. So perhaps
it should not have been a shock.
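
The size of Marsaglia's bound is easy to tabulate. Here is a short Python sketch, assuming only the formula $(n!\,m)^{1/n}$ quoted above; the function name `marsaglia_bound` is an illustrative choice.

```python
from math import factorial


def marsaglia_bound(n, m):
    """The quantity (n! * m)^(1/n) bounding the number of hyperplanes."""
    return (factorial(n) * m) ** (1.0 / n)


# Bounds for the modulus m = 2^32 in dimensions n = 2, ..., 10.
for n in range(2, 11):
    print(n, int(marsaglia_bound(n, 2**32)))
```

The bound drops quickly as the dimension grows, which is why the hyperplane structure is so visible in high-dimensional Monte Carlo work.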

We wish to do a simple experiment along these lines. Again we take a fairly small prime,
namely $p = 499$, and note that $\mathbb{Z}_{499}^* = \langle 7 \pmod{499} \rangle$. We compute a vector $v \in [0, 1]^{498}$
whose $j$th component is the real number $\frac{1}{499}$ times $7^j \pmod{499}$, identifying $7^j \pmod{499}$
as an integer between 1 and 498. Here we use Mathematica’s PowerMod as described in
Section 4.1 on public-key cryptography. If we do Mathematica’s ListPlot[v] for this
vector of points in $[0, 1]$, we will get a fairly random looking set of points in the plane.
See Figure 8.3.
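
The same vector can be produced without Mathematica. The sketch below uses Python's built-in three-argument `pow` in place of PowerMod; the variable names are illustrative, and no plotting library is assumed, so the lists would have to be handed to a plotting tool of your choice.

```python
p, g = 499, 7

# v_j = (7^j mod 499) / 499 for j = 1, ..., 498.
v = [pow(g, j, p) / p for j in range(1, p)]

# Data for the two- and three-dimensional plots: consecutive pairs and triples.
pairs = list(zip(v, v[1:]))
triples = list(zip(v, v[1:], v[2:]))

print(v[:5])
print(pairs[:3])
```

Plotting the points $(j, v_j)$ gives a picture like Figure 8.3, while plotting `pairs` and `triples` gives the pictures discussed next.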


Figure 8.3 Plot of points $P_j = (j, v_j)$ whose second component is the real number $\frac{1}{499}$ times
$7^j \pmod{499}$, identifying $7^j \pmod{499}$ as an integer between 1 and 498


However, there is a pitfall in the method. We are really only allowed to think of this as
a one-dimensional thing. For if we try to plot points $(v_j, v_{j+1}) \in [0, 1]^2$, we get Figure 8.4.


Figure 8.4 Plot of points $P_j = (v_j, v_{j+1})$ whose first component is the real number $\frac{1}{499}$ times
$7^j \pmod{499}$, identifying $7^j \pmod{499}$ as an integer between 1 and 498
The same sort of thing happens in three dimensions, taking points
$(v_j, v_{j+1}, v_{j+2}) \in [0, 1]^3$ to give Figure 8.5.


Figure 8.5 Plot of points $P_j = (v_j, v_{j+1}, v_{j+2})$ whose first component is the real number $\frac{1}{499}$ times
$7^j \pmod{499}$, identifying $7^j \pmod{499}$ as an integer between 1 and 498


Suppose now we compute another random vector for a different prime modulus. We form
a vector $w$ with entries in $[0, 1]$, with $w_j$ being the real number $\frac{1}{503}$ times $5^j \pmod{503}$,
identifying $5^j \pmod{503}$ as an integer between 1 and 502. Then we plot points
$P_j = (v_j, w_j) \in [0, 1]^2$ in Figure 8.6.


Figure 8.6 Plot of points $P_j = (v_j, w_j)$ whose first component is the real number $\frac{1}{499}$ times
$7^j \pmod{499}$, identifying $7^j \pmod{499}$ as an integer between 1 and 498 and whose second
component is the analog with 499 replaced with 503
So the points formed using two random number generators look more random although
connecting some dots might create some creatures. We can do a 3D plot of points formed
using a third random number generator. That is, we create another vector $z$ with entries in
$[0, 1]$, with $z_j$ being the real number $\frac{1}{521}$ times $3^j \pmod{521}$, identifying $3^j \pmod{521}$ as an
integer between 1 and 520. This gives us Figure 8.7 plotting points $P_j = (v_j, w_j, z_j) \in [0, 1]^3$.


Figure 8.7 Points $(v_j, w_j, z_j)$ from three vectors $v, w, z$ formed from powers of generators of $\mathbb{F}_p^*$ for
$p = 499$, 503, and 521, respectively


Now we have no obvious hyperplanes. 
Exercise 8.1.8 Find the equations of the lines in Figure 8.4. 
Exercise 8.1.9 Find the equations of the hyperplanes in Figure 8.5. 


Applied mathematicians were in an uproar over the hyperplanes and did not want to 
use more than one generator, presumably worrying about slowing down the whole process. 
So new methods for random numbers arose. In 1995 Matlab switched to a Marsaglia gen- 
erator. Brent [11] noticed that the Marsaglia Xor Shift generator can be viewed as a linear 
feedback shift register. Press et al. [86] spend many pages criticizing the linear congruen- 
tial generators and then list them as 2/3 of their methods. Mathematica gives many basic 
methods. Not surprisingly Wolfram’s favorites - cellular automata - appear. You are also 
allowed to create your own generator. 

Derrick H. Lehmer (1905-1991) and his wife Emma were well-known number theorists of 
the last century. Derrick’s father was also a number theorist who had built a prime generat- 
ing machine. Like his father, D. H. was on the University of California, Berkeley, faculty - 
except for that short time in the early 1950s when he was fired for refusing to sign the 
university Regents’ loyalty oath. That was the era of Joe McCarthy’s un-American activity 
committee and the blacklists. Many readers may not remember this era when it was feared 
that communists were everywhere. I recommend reading the biography of D. H. Lehmer on 
Wikipedia. During the early 1950s D. H. was Director of the National Bureau of Standards’ 
Institute for Numerical Analysis. In 1952, the loyalty oath was declared unconstitutional 
by the state Supreme Court and D. H. returned to the University of California, Berkeley, and
chaired the mathematics department from 1954 to 1957. Emma Lehmer, being a woman,
was never allowed to join the faculty. Yes, it was the bad old days. 

I remember the Lehmers as two of the principal organizers of the West Coast Number 
Theory conference, a conference that I first attended as a young assistant professor in the 
1970s. It is still meeting each year. Under their leadership this conference was democratically 
run. There was no elite bunch of organizers deciding who could speak and who could not. 
Instead, at the first night of the conference, anyone who wanted to give a talk would put 
their title on a piece of paper and the conference would be organized by putting talks on 
nearby subjects together. 


Some random number generators generalize the Lehmer linear congruential generator
by involving other finite fields or rings than $\mathbb{Z}_m$. They can be produced from feedback shift
registers discussed earlier when we constructed log tables for $\mathbb{F}_p(\theta)$, where $\theta$ is a root of
a primitive polynomial $f(x)$ of degree $n$ in $\mathbb{F}_p[x]$. See Section 5.6 and Figure 5.4. These
methods are also related to the construction of the Fibonacci numbers in Section 1.4. The
sequences $\{s_n\}$ considered here involve elements of some finite field $\mathbb{F}_q$. The main reference
that we use is Lidl and Niederreiter [69]. A reference covering more general theory than
that over finite fields is Goresky and Klapper [36]. Sadly, going between references always
involves changes of notation with minus signs disappearing, matrix transposes appearing
and so forth. Reader beware! There are also differences in finite state machine diagrams
involving up versus down as well as right versus left.

Define a linear recurrent (also linearly recurring) sequence to be $\{s_n\}$ obtained using the
recursion formula below once you are given coefficients $b, a_0, a_1, \ldots, a_{k-1} \in \mathbb{F}_q$ and initial
values $s_0, s_1, \ldots, s_{k-1} \in \mathbb{F}_q$,

$$s_{n+k} = a_{k-1}s_{n+k-1} + a_{k-2}s_{n+k-2} + \cdots + a_0 s_n + b, \quad n = 0, 1, 2, \ldots. \tag{8.2}$$

Some would call formula (8.2) a difference equation. Recall that the Fibonacci numbers
were defined by such a recursive equation $f_{n+2} = f_{n+1} + f_n$ over the field $\mathbb{Q}$ in Section 1.4.
When $b = 0$, the recursion (8.2) is called homogeneous. We will restrict ourselves to that
case. Feedback shift registers are electronic circuits that can be used to produce the $s_n$. An
impulse response sequence has initial values (for $k \geq 2$) given by $s_0 = s_1 = \cdots = s_{k-2} = 0$
and $s_{k-1} = 1$.

Associated to the linear recurrent sequence from (8.2) we have the characteristic
polynomial

$$f(x) = x^k - a_{k-1}x^{k-1} - \cdots - a_1 x - a_0$$

in $\mathbb{F}_q[x]$. The coefficients of this polynomial are $-1$ times the coefficients of the homogeneous
sequence - ignoring $b$.


Example 1. Consider the field $\mathbb{F}_3$ and the polynomial $x^3 - x^2 - x + 1$. The homogeneous
recurrence (8.2) is

$$s_{n+3} = s_{n+2} + s_{n+1} - s_n.$$

We will create a table of an impulse response sequence $s_n$. That means we start with
0, 0, 1.

n   | 0 | 1 | 2 | 3 | 4  | 5  | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17
s_n | 0 | 0 | 1 | 1 | -1 | -1 | 0 | 0 | 1 | 1 | -1 | -1 | 0  | 0  | 1  | 1  | -1 | -1

We find that the sequence has period 6. That is, $s_{n+6} = s_n$, for all $n \geq 0$. That might not
be what you expected if you thought that the polynomial $x^3 - x^2 - x + 1$ would have roots
generating a finite field of order 27. However $x^3 - x^2 - x + 1 = (x^2 - 1)(x - 1)$ over $\mathbb{F}_3$.
When the polynomial is reducible, the period need not divide the order of the multiplicative
group of the field extension generated by a root of an irreducible polynomial, 26 in this
case. ▲


Example 2. Consider the field $\mathbb{F}_3$ and the polynomial $x^2 - x - 1$. This is a primitive polynomial
and so we expect period $3^2 - 1 = 8$. Again, we create a table for the impulse response
sequence. The feedback shift register diagram for this example is shown in Figure 8.8. The
table of values coming from the recurrence

$$s_{n+2} = s_{n+1} + s_n$$

is given below.

n   | 0 | 1 | 2 | 3  | 4 | 5  | 6  | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17
s_n | 0 | 1 | 1 | -1 | 0 | -1 | -1 | 1 | 0 | 1 | 1  | -1 | 0  | -1 | -1 | 1  | 0  | 1

The period is indeed 8: that is, $s_{n+8} = s_n$, for all $n \geq 0$. ▲
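
The tables in Examples 1 and 2 can be regenerated by a short program. The following Python sketch assumes that $q$ is prime (so that $\mathbb{F}_q$ is the integers mod $q$), that $b = 0$, and that $a_0 \neq 0$ (so that the sequence is purely periodic); the function names are illustrative, and residues are printed as 0, 1, 2, so that 2 stands for $-1$ in $\mathbb{F}_3$.

```python
def linear_recurrence(coeffs, initial, q, count):
    """First `count` terms of s_{n+k} = a_{k-1} s_{n+k-1} + ... + a_0 s_n over F_q.

    coeffs = [a_0, ..., a_{k-1}], initial = [s_0, ..., s_{k-1}], q prime.
    """
    s = [x % q for x in initial]
    k = len(coeffs)
    while len(s) < count:
        s.append(sum(a * x for a, x in zip(coeffs, s[-k:])) % q)
    return s[:count]


def least_period(coeffs, initial, q):
    """Number of steps until the state (s_n, ..., s_{n+k-1}) first returns to its start.

    Assumes a_0 != 0 in F_q, so that the state map is invertible.
    """
    start = tuple(x % q for x in initial)
    state, steps = start, 0
    while True:
        nxt = sum(a * x for a, x in zip(coeffs, state)) % q
        state = state[1:] + (nxt,)
        steps += 1
        if state == start:
            return steps


# Example 1: s_{n+3} = s_{n+2} + s_{n+1} - s_n over F_3, so (a_0, a_1, a_2) = (-1, 1, 1).
print(linear_recurrence([-1, 1, 1], [0, 0, 1], 3, 18), least_period([-1, 1, 1], [0, 0, 1], 3))

# Example 2: s_{n+2} = s_{n+1} + s_n over F_3, so (a_0, a_1) = (1, 1).
print(linear_recurrence([1, 1], [0, 1], 3, 18), least_period([1, 1], [0, 1], 3))
```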


Figure 8.8 Feedback shift register corresponding to Example 2


Exercise 8.1.10 Consider the field $\mathbb{F}_2$ and the polynomial $x^5 + x + 1$. Produce a table for
the homogeneous linear recurrent sequence associated to this characteristic polynomial by
(8.2). Take the initial entries 0, 0, 0, 0, 1 for the impulse response sequence. What is the
period? Note that $x^5 + x + 1 = (x^3 + x^2 + 1)(x^2 + x + 1)$. Draw the feedback shift register
corresponding to this example.


Exercise 8.1.11 Same as the preceding exercise - except use the primitive polynomial
$x^5 + x^2 + 1$ over $\mathbb{F}_2$.


As we said in Exercise 7.6.10, the successive states of the finite state machine correspond- 
ing to a polynomial can be found by multiplication by a matrix called the companion matrix 
of the polynomial. Now we want to see what happens for our linear recurrent sequences. 
To study the linear recurrent sequence $\{s_n\}$ defined by recursion (8.2), we associated the
characteristic polynomial $f(x) = x^k - a_{k-1}x^{k-1} - \cdots - a_1 x - a_0$ in $\mathbb{F}_q[x]$. Associated to this
polynomial is the companion matrix whose definition we recall

$$C_f = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & a_0 \\
1 & 0 & 0 & \cdots & 0 & a_1 \\
0 & 1 & 0 & \cdots & 0 & a_2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & a_{k-1}
\end{pmatrix}. \tag{8.3}$$

Now consider state vectors $v_n = (s_n, s_{n+1}, \ldots, s_{n+k-1})$. These are row vectors with entries
in $\mathbb{F}_q$. The vector $v_0$ is the initial values vector. Suppose we multiply $v_n$ and $C_f$. We obtain

$$v_n C_f = (s_n, s_{n+1}, \ldots, s_{n+k-1})\, C_f = \Bigl(s_{n+1}, s_{n+2}, \ldots, s_{n+k-1}, \sum_{j=0}^{k-1} a_j s_{n+j}\Bigr) = v_{n+1}.$$


Note that, in general, this is not the same as the result of Exercise 7.6.10 where we
computed the log table for a finite field extension generated by a primitive polynomial $f$
using the companion matrix of $f$. There we had the same matrix but we used column vectors
$w_j$ and thus looked at $w_{j+1} = C_f w_j$.
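
The identity $v_n C_f = v_{n+1}$ can be checked on Example 1. The Python sketch below assumes $q$ prime and uses the layout of (8.3); `companion` and `vec_mat` are illustrative helper names.

```python
def companion(a, q):
    """Companion matrix (8.3): 1s on the subdiagonal, a_0, ..., a_{k-1} in the last column."""
    k = len(a)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1
    for i in range(k):
        C[i][k - 1] = a[i] % q
    return C


def vec_mat(v, C, q):
    """Row vector times matrix, reduced mod q."""
    k = len(v)
    return [sum(v[i] * C[i][j] for i in range(k)) % q for j in range(k)]


# Example 1: (a_0, a_1, a_2) = (-1, 1, 1) over F_3, impulse response start v_0 = (0, 0, 1).
q = 3
C = companion([-1, 1, 1], q)
v = [0, 0, 1]
states = []
for _ in range(8):
    states.append(tuple(v))
    v = vec_mat(v, C, q)

# The first coordinates of the successive state vectors are s_0, s_1, s_2, ...
print([state[0] for state in states])
```

Each state vector is a sliding window of $k$ consecutive terms, so reading off first coordinates recovers the sequence itself.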

Suppose we consider a homogeneous linear recurrent sequence given by (8.2) with $b = 0$
and having companion matrix (8.3). Suppose that $a_0 \neq 0$. Then, by the following exercise,
$\det(C_f) \neq 0$, which means that $C_f$ is an invertible $k \times k$ matrix and thus an element of the
finite group $GL(k, \mathbb{F}_q)$ known as the general linear group over $\mathbb{F}_q$. What does the periodicity
of the linear recurrent sequence have to do with the order of the companion matrix $C_f$ in
the general linear group $GL(k, \mathbb{F}_q)$?


Exercise 8.1.12 Show that if the companion matrix is given by (8.3) and $a_0 \neq 0$, then
$\det(C_f) \neq 0$.


Hint. $\det(C_f) = (-1)^{k-1} a_0$.


Exercise 8.1.13 What is the order of GL(k,F,)? 


Hint. The columns must be nonzero and linearly independent. How many ways can you
choose the first column in $\mathbb{F}_q^k$? Then once the first column is fixed, how many ways are there
to choose the second column?


It follows from our computations that if $\{s_n\}$ is defined by (8.2) with $b = 0$, $a_0 \neq 0$, and
the characteristic polynomial of the sequence is $f$ with companion matrix $C_f$, then if the
companion matrix has order $r$ in $GL(k, \mathbb{F}_q)$, certainly the sequence repeats after $r$ steps
but it may have a smaller period. Since every number of steps after which a periodic sequence
repeats is a multiple of its least period, the period divides $r$. If, moreover, $f$ is a primitive
polynomial over $\mathbb{F}_q$ then the order of $C_f$ is $q^k - 1$. In
general, if $b = 0$, $a_0 \neq 0$, the period of the sequence therefore divides the
order of $GL(k, \mathbb{F}_q)$. But the state vectors $v_n$ will not in general be the same as the vectors
$w_n$ in our log tables for the multiplicative group of the finite field $\mathbb{F}_q(\theta)$ for a root $\theta$ of $f$.
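
The order of a companion matrix in $GL(k, \mathbb{F}_q)$ can be found by brute force for small examples. The sketch below assumes $q$ prime and $a_0 \neq 0$ (otherwise the loop would not terminate); the helper names are illustrative.

```python
def companion(a, q):
    """Companion matrix (8.3): 1s on the subdiagonal, a_0, ..., a_{k-1} in the last column."""
    k = len(a)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1
    for i in range(k):
        C[i][k - 1] = a[i] % q
    return C


def mat_mul(A, B, q):
    """Product of two square matrices, reduced mod q."""
    k = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(k)) % q for j in range(k)] for i in range(k)]


def matrix_order(C, q):
    """Smallest r >= 1 with C^r = I; requires C invertible mod q."""
    k = len(C)
    identity = [[1 if i == j else 0 for j in range(k)] for i in range(k)]
    power, r = [row[:] for row in C], 1
    while power != identity:
        power = mat_mul(power, C, q)
        r += 1
    return r


order_example_1 = matrix_order(companion([-1, 1, 1], 3), 3)  # x^3 - x^2 - x + 1 over F_3
order_example_2 = matrix_order(companion([1, 1], 3), 3)      # x^2 - x - 1 over F_3
print(order_example_1, order_example_2)
```

These orders can be compared with the periods 6 and 8 found in Examples 1 and 2.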

Next we find a formula for the $s_n$ in our linear recurrent sequences in terms of powers
of the roots of the characteristic polynomial of the sequence. This is exactly analogous to
the connection between Fibonacci numbers and the golden ratio

$$\phi = \frac{1 + \sqrt{5}}{2} \approx 1.618\,033\,988\,749\,894\,848\,2. \tag{8.4}$$

Artists and photographers believe that a golden rectangle with its width to height ratio of $\phi$
to 1 is particularly pleasing to the human eye. That $\phi = 1 + \frac{1}{\phi}$ means that you can remove
a square from a golden rectangle and have another smaller golden rectangle. Moreover the
process can be continued, leading to an inspiration to artists. See Young [128] for some
pictures and more stories about the Fibonacci numbers. Another reference is [70].

Now we consider the Fibonacci numbers to suggest what happens for our linear recurrent
sequences. Recall that the Fibonacci numbers $f_n$ satisfy the recursion $f_{n+2} = f_{n+1} + f_n$, $n =
0, 1, 2, 3, \ldots$, starting with $f_0 = 0$ and $f_1 = 1$. Of course - unlike the linear recurrent sequences
in finite fields - the sequence of Fibonacci numbers is definitely not periodic. The sequence
$\{f_n\}$ is a strictly increasing sequence, once $n$ is larger than 1. Recalling how we connect
a recursion (8.2) with its characteristic polynomial, we see that the polynomial connected
with the Fibonacci recursion is $x^2 - x - 1$ over the field of rational numbers $\mathbb{Q}$. The roots
of this polynomial are the golden ratio defined by (8.4) and its conjugate $\phi' = (1 - \sqrt{5})/2$.
We claim that $f_n = \beta_1 \phi^n + \beta_2 (\phi')^n$. To find the $\beta_i$, just look at the cases $n = 0$ and $n = 1$.
This gives two linear equations in two unknowns. The solution is $\beta_1 = 1/\sqrt{5} = -\beta_2$. Thus
we find that, for all $n = 0, 1, 2, 3, \ldots$, the Fibonacci numbers are

$$f_n = \frac{1}{\sqrt{5}}\left(\phi^n - (\phi')^n\right). \tag{8.5}$$


Exercise 8.1.14 Show that the right-hand side of formula (8.5) satisfies the same recursion
as the Fibonacci numbers and gives the same initial values for $n = 0$ and $n = 1$, namely
$f_0 = 0$ and $f_1 = 1$.
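
Formula (8.5) can also be checked numerically before it is proved. The following Python sketch uses floating-point arithmetic, so it is only trustworthy for moderate $n$; the function names are illustrative.

```python
from math import sqrt


def fib_recursive(n):
    """f_n from the recursion f_{n+2} = f_{n+1} + f_n with f_0 = 0, f_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """f_n from formula (8.5), rounded to the nearest integer to absorb rounding error."""
    phi = (1 + sqrt(5)) / 2
    phi_conj = (1 - sqrt(5)) / 2
    return round((phi ** n - phi_conj ** n) / sqrt(5))


print([fib_recursive(n) for n in range(12)])
print([fib_closed_form(n) for n in range(12)])
```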


With this example in mind, we will try to do the analog in finite fields. Assume that
$\{s_n\}$ is a linear recurrent sequence in the finite field $\mathbb{F}_q$ defined by the recursion (8.2) with
$b = 0$. Suppose that the characteristic polynomial of this recursion is $f(x)$ and that $f(x)$ has $k$
distinct roots $\theta_1, \ldots, \theta_k$ in a splitting field $E$ over $\mathbb{F}_q$. Then there exist $\beta_i \in E$ for $i = 1, \ldots, k$,
such that we have the following formula for $s_n$:

$$s_n = \sum_{i=1}^{k} \beta_i \theta_i^n. \tag{8.6}$$


The proof can be found in Lidl and Niederreiter [69, p. 196]. One proceeds as in the Fibonacci
number example - making use of Cramer’s rule to solve k linear equations in k unknowns.

Example. Consider the linear recurrent sequence associated to the polynomial $x^2 - x - 1$
over $\mathbb{F}_3$. The recurrence is $s_{n+2} = s_{n+1} + s_n$, $n = 0, 1, 2, 3, \ldots$ Assume that the initial values
are $s_0 = 0$ and $s_1 = 1$. The polynomial $x^2 - x - 1$ is a primitive polynomial over $\mathbb{F}_3$. Its roots
in a splitting field $\mathbb{F}_9$ are $\theta$ and $\theta^3$, making use of what we know from Galois theory. Thus
formula (8.6) says that

$$s_n = \beta_1 \theta^n + \beta_2 \theta^{3n} = \theta^{n+2} - \theta^{3n+2}. \qquad ▲$$
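The closed form in this example can be checked by machine. The following is only a sketch under stated assumptions: elements of $\mathbb{F}_9$ are modelled as pairs `(a, b)` standing for $a + b\theta$ with $\theta^2 = \theta + 1$, and the helper names (`add`, `mul`, `power`, `closed_form`) are made up for this illustration. It compares $\theta^{n+2} - \theta^{3n+2}$ with the sequence produced by the recurrence.

```python
# F_9 modelled as F_3[theta]/(theta^2 - theta - 1); (a, b) means a + b*theta.
P = 3

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def neg(u):
    return ((-u[0]) % P, (-u[1]) % P)

def mul(u, v):
    # (a + b t)(c + d t) = ac + (ad + bc) t + bd t^2, with t^2 = t + 1.
    a, b = u
    c, d = v
    return ((a * c + b * d) % P, (a * d + b * c + b * d) % P)

def power(u, n):
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

THETA = (0, 1)

def closed_form(n):
    # theta^(n+2) - theta^(3n+2), as an element of F_9
    return add(power(THETA, n + 2), neg(power(THETA, 3 * n + 2)))

def recurrence(count):
    # s_{n+2} = s_{n+1} + s_n over F_3 with s_0 = 0, s_1 = 1
    s = [0, 1]
    while len(s) < count:
        s.append((s[-1] + s[-2]) % P)
    return s[:count]

sequence = recurrence(16)
closed = [closed_form(n) for n in range(16)]
print(sequence[:8])                                          # [0, 1, 1, 2, 0, 2, 2, 1]
print(all(c == (s, 0) for c, s in zip(closed, sequence)))    # True
```

Here 2 plays the role of $-1$ in $\mathbb{F}_3$, so the printed period is the sequence 0, 1, 1, -1, 0, -1, -1, 1.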


Exercise 8.1.15 Fill in the details of the computation of the coefficients $\beta_1$ and $\beta_2$ in the
last formula for $s_n$ associated to the polynomial $x^2 - x - 1$. Assume that the initial values
are $s_0 = 0$ and $s_1 = 1$.


Next let us say a bit about statistical tests of the randomness of the linear recurrent 
sequences. We take our discussion from Lidl and Niederreiter [69, pp. 283ff]. We assume 
that the sequence has a period r large enough for the application to pseudorandom numbers. 
Then there are three tests to consider: 


1. distribution of the $s_n$,
2. distribution of blocks of $s_n$,
3. correlations of the $s_n$ with $s_{n+h}$.


Goresky and Klapper [36] note that such tests were formulated by S. Golomb in the 1960s. 
In order to consider the first two tests, define for a block $b = (b_1, \ldots, b_m) \in \mathbb{F}_q^m$:

$$Z_m(b) = \#\{\, n \mid 0 \le n \le r - 1,\ s_{n+i-1} = b_i \text{ for all } i = 1, \ldots, m \,\}.$$

Here $r$ denotes the period of the sequence. Then one wants $m = 1$ for test 1 and one hopes
that $Z_1(b)$ is more or less constant, that is, uniform. For test 2, if $m \ge 2$, one hopes that
$Z_m(b)$ is more or less constant. For test 3 one considers a Fourier transform.

Assume that we are looking at a homogeneous linear recurrent sequence $\{s_n\}$ for which
the characteristic polynomial is a monic primitive polynomial in $\mathbb{F}_q[x]$ of degree $k \ge m$.
Then the period is $r = q^k - 1$. Assuming the initial states are not all 0, we run through all
$q^k - 1$ nonzero vectors in a period. This means that $Z_m(b)$ is the number of nonzero vectors
$v \in \mathbb{F}_q^k$ that have $b$ as their first $m$ coordinates. Therefore, if $k \ge m$,

$$Z_m(b) = \begin{cases} q^{k-m} - 1 & \text{if } b = 0, \\ q^{k-m} & \text{if } b \neq 0. \end{cases} \tag{8.7}$$


This result says that our hopes for tests 1 and 2 are fulfilled - assuming the characteristic 
polynomial is primitive and we thus get the maximal period. 

For test 3, one makes use of the Fourier transform on $\mathbb{F}_q$. It is a little easier to assume
$q = p$ is a prime as we have discussed the finite Fourier transform in Section 4.2. We look
at a homogeneous linear recurrent sequence $\{s_n\}$ for which the characteristic polynomial
is a monic primitive polynomial in $\mathbb{F}_p[x]$ of degree $k$. Let $r = p^k - 1$ denote the period of
the sequence. Then consider the following sum involving the fixed non-trivial character
$\chi(b) = \chi_1(b) = e^{2\pi i b/p}$, for $b \in \mathbb{Z}$. The sum - called the autocorrelation - is defined to be

$$\mathrm{Cor}(h) = \sum_{n=0}^{r-1} \chi(s_n - s_{n+h}).$$

With the assumptions of the preceding paragraph, assuming that the initial states of our
sequence are not all 0 as well, if $h$ is not congruent to 0 mod $r$, we find that $\mathrm{Cor}(h) = -1$.
To see this, use formula (8.7) on the sequence $u_n = s_n - s_{n+h}$, which satisfies the same
recursion as $s_n$. Then if the period $r$ does not divide $h$, we have

$$\mathrm{Cor}(h) = (q^{k-1} - 1)\chi(0) + q^{k-1} \sum_{b \in \mathbb{F}_q^*} \chi(b) = -1 + q^{k-1} \sum_{b \in \mathbb{F}_q} \chi(b) = -1. \tag{8.8}$$


We used the orthogonality relation for additive characters of $\mathbb{F}_p$ from Lemma 4.2.1 in the last
equality. The same argument will work for an arbitrary finite field, once one has considered 
the additive characters of the field. More details are in Lidl and Niederreiter [69, p. 284] - 
also Goresky and Klapper [36]. Golomb says a sequence like the one considered here has 
an ideal correlation function. 


Example. Consider the recursion $s_{n+2} = s_{n+1} + s_n$ over $\mathbb{F}_3$ with initial values $s_0 = 0$, $s_1 = 1$.
Then we saw above that the period is 8 and we computed the $s_n$ row of the table below:

n        |  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7
s_n      |  0 |  1 |  1 | -1 |  0 | -1 | -1 |  1
s_{n+1}  |  1 |  1 | -1 |  0 | -1 | -1 |  1 |  0

First we consider test 3. If we set $\zeta = e^{2\pi i/3}$ and $\chi(a) = \zeta^a$, then we find that in this
case $\mathrm{Cor}(1) = 2\chi(0) + 3\chi(-1) + 3\chi(1) = -1$. This agrees with equation (8.8). As for test 1,
we see that over a period, $s_n = 0$ twice; $s_n = 1$ three times; and $s_n = -1$ three times. This
agrees with equation (8.7), when $k = 2$ and $m = 1$. The distribution is pretty uniform. Lastly
consider test 2 for $m = 2$. The number of pairs $(s_n, s_{n+1})$ that are equal to $(0, 0)$ is 0. The
number of pairs $(s_n, s_{n+1})$ that are equal to any $(b_1, b_2)$ except $(0, 0)$ is 1. This agrees with
equation (8.7), when $k = m = 2$. Again the distribution is pretty uniform. ▲
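The three tests in this example are easy to rerun by computer. The sketch below is an illustration only: it assumes the autocorrelation is the sum of $\chi(s_n - s_{n+h})$ over one period (which is what the computation of Cor(1) above uses), and the function names `block_counts`, `predicted`, and `cor` are invented here.

```python
import cmath
from collections import Counter
from itertools import product

P, K = 3, 2                      # field F_3, recursion of degree k = 2

def lfsr(count):
    # s_{n+2} = s_{n+1} + s_n over F_3 with s_0 = 0, s_1 = 1
    s = [0, 1]
    while len(s) < count:
        s.append((s[-1] + s[-2]) % P)
    return s

r = P**K - 1                     # period 8
s = lfsr(2 * r)                  # two periods, so blocks and shifts may run past n = r - 1

def block_counts(m):
    # Z_m(b) for every block b of length m that occurs in one period
    return Counter(tuple(s[n:n + m]) for n in range(r))

def predicted(b, m):
    # right-hand side of (8.7)
    q_pow = P ** (K - m)
    return q_pow - 1 if not any(b) else q_pow

def cor(h):
    zeta = cmath.exp(2j * cmath.pi / P)
    return sum(zeta ** ((s[n] - s[n + h]) % P) for n in range(r))

for m in (1, 2):
    counts = block_counts(m)
    for b in product(range(P), repeat=m):
        assert counts[b] == predicted(b, m)

print(dict(block_counts(1)))
print([round(cor(h).real, 6) for h in range(1, r)])
```

Up to floating-point rounding, every shift $h = 1, \ldots, 7$ should give a real part of $-1$, in line with (8.8).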


Exercise 8.1.16 Imitate the preceding example for the linear recurrent sequence for which 
the characteristic polynomial is x° + x? + 1 over F,. In test 3, the character x is defined by 
$\chi(a) = (-1)^a$. So in this case Cor(1) is the number of agreements of $s_n$ with $s_{n+1}$ minus
the number of disagreements. 


8.2 Error-Correcting Codes 


References for this section include Larry L. Dornhoff and Franz E. Hohn [25], William J. 
Gilbert and W. Keith Nicholson [35], Joseph A. Gallian [33], Vera Pless [85], Kenneth Rosen 
[92], Judy Walker [123], and Audrey Terras [116]. 

Suppose that we must send a message of 0s and 1s from our computer on earth to
Professor Bolukxy’s computer on Xotl. No doubt errors will be introduced by transmission
over such a long distance and some random 1 will turn into a 0. In order for Professor
Bolukxy to figure out our message, there must be some redundancy built in. Error-correcting
codes are created for that purpose. The original signal $s \in \mathbb{F}_2^n$ will be encoded as $x \in \mathbb{F}_2^{n+r}$. If
errors are added in transmission of the encoded signal, Professor Bolukxy will use a decoder
to find the most likely original signal $s' \in \mathbb{F}_2^n$ - hoping that there is enough redundancy to
do so. Such methods are used in compact discs as well as communications with spacecraft.
The goal of error correction is really the opposite of the goal of cryptography. Here we
want our message to be understood. The simplest method might be to repeat the message a
number of times. However, that would be inefficient.

Figure 8.9 Sending a message of 0s and 1s to Professor Bolukxy on the planet Xotl

R. W. Hamming (1915-1998) of Bell Telephone Labs published his codes in 1950. He 
had been working on a computer using punched cards. Whenever the machine detected an 
error, the computer would stop. Hamming got frustrated and began work on a way to correct 
errors. Hamming also introduced the Hamming distance defined below to get a measure of 
the error in a signal. And he worked on the Manhattan project - doing simulations to model 
whether the atomic bomb would ignite the atmosphere. I cannot resist including a quote 
from Hamming’s book [40]: “we will avoid becoming too involved with mathematical rigor, 
which all too often tends to become rigor mortis.” 


Definition 8.2.1 A linear code C is a vector subspace C of $\mathbb{F}_q^n$.

Here $\mathbb{F}_q$ denotes the field with q elements. If the dimension of C as a vector space over $\mathbb{F}_q$
is k, we call C an [n, k]-code. Since all codes we consider are linear, we will drop the word
“linear” and just call them “codes.” Here q will be 2 mostly. Such codes are called “binary.”
If q = 3, the code is “ternary.”


Definition 8.2.2 The Hamming weight of a codeword $x \in C$ is $|x|$, which is the number
of components of x that are nonzero. The distance between $x, y \in C$ is defined to be
$d(x, y) = |x - y|$.


Exercise 8.2.1 For the vector space $V = \mathbb{F}_q^n$, show that the Hamming distance $d(x, y)$ has the
following properties for all $x, y, u \in V$:

(a) $d(x, y) = d(y, x)$;
(b) $d(x, y) \ge 0$ and $d(x, y) = 0 \iff x = y$;
(c) (triangle inequality) $d(x, y) \le d(x, u) + d(u, y)$;
(d) $d(x, y) = d(x + u, y + u)$;
(e) $d(x, y) \in \mathbb{Z}^+ \cup \{0\}$.


The first three properties of the Hamming distance in this exercise make it a metric on V. 
The fourth property makes it a translation-invariant metric on V. 


Exercise 8.2.2

(a) For the prime p, show that the Hamming weight of $x \in \mathbb{F}_p^n$ satisfies
$|x| \equiv x_1^{p-1} + \cdots + x_n^{p-1} \pmod{p}$.

(b) Consider $\mathbb{F}_p^n$ as a group under addition and form the Cayley graph $X(\mathbb{F}_p^n, S)$, where
$S = \{ s \in \mathbb{F}_p^n \mid |s| = 1 \}$.
Draw the graph for p = 5 and n = 2.


Definition 8.2.3 If C is an [n, k]-code such that the minimum distance of a nonzero
codeword from 0 is d, we say that C is an [n, k, d]-code.

The following theorem assumes you decode a received vector as the nearest codeword
using the Hamming distance.

Theorem 8.2.1 If $d \ge 2e + 1$, an [n, k, d]-code C corrects e or fewer errors.

Proof. Suppose distinct $x, y \in C$ are such that $d(x, y) \ge 2e + 1$. If the received word r has at
most e errors, it cannot be in the Hamming ball of radius e about both x and y, since that
would imply $0 < |x - y| = d(x, y) \le d(x, r) + d(r, y) \le e + e = 2e$. So the code can correct e
errors - assuming that we can find the nearest codeword to r. ▲


We need to add a few more definitions to our coding vocabulary. 

Since an [n, k]-code C is a k-dimensional vector space over $\mathbb{F}_q$, the code C has a k-element
basis. Therefore we can form a matrix whose rows are the basis vectors. This is called a
generator matrix G of the code C. A generator matrix of an [n, k]-code is a $k \times n$ matrix
of rank k with elements in $\mathbb{F}_q$. The code C is the image of the map sending the row vector
$v \in \mathbb{F}_q^k$ to $vG$. Since C has more than one basis, it also has many generator matrices. The
standard generator matrix has the form

$$G = (I_k \;\; A), \tag{8.9}$$

where the first k columns form the $k \times k$ identity matrix $I_k$. If the generating matrix is in
standard form, with no errors, decoding is easy; just take the first k entries of the codeword.
We know that we can use elementary row operations over $\mathbb{F}_q$ to put any generator matrix
into row echelon form; since this matrix must have rank k, the result is the standard form
$(I_k \;\; A)$, possibly after reordering the coordinates of the code.

To describe the encoding envisioned here, we take the original message viewed as a row
vector $s \in \mathbb{F}_q^k$ and then we encode the message as $sG$, adding redundancy to be able to
correct errors.

A parity check matrix H of an [n, k]-code C is a matrix with n rows and rank $n - k$
such that $x \in C$ if and only if $xH = 0$. (Many texts take the transpose of H instead.) If $G = (I_k \;\; A)$,
then

$$H = \begin{pmatrix} -A \\ I_{n-k} \end{pmatrix}. \tag{8.10}$$
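A quick numerical sanity check of (8.9) and (8.10) may help before attempting the proof. The block `A` below is a made-up ternary example (an assumption for illustration, not a code from the text), and `row_times` is a hypothetical helper. The sketch confirms that the row space of $G = (I_k \; A)$ coincides with the set of row vectors $x$ satisfying $xH = 0$.

```python
from itertools import product

Q = 3
A = [[1, 2, 0],
     [2, 2, 1]]          # hypothetical k x (n - k) block with k = 2, n = 5
k, m = len(A), len(A[0])
n = k + m

# G = (I_k  A) is k x n; H stacks -A on top of I_{n-k}, so it is n x (n - k).
G = [[int(i == j) for j in range(k)] + A[i] for i in range(k)]
H = [[(-a) % Q for a in row] for row in A] + \
    [[int(i == j) for j in range(m)] for i in range(m)]

def row_times(x, M):
    """Row vector x times matrix M, with entries reduced mod Q."""
    return tuple(sum(xi * M[i][j] for i, xi in enumerate(x)) % Q
                 for j in range(len(M[0])))

code = {row_times(s, G) for s in product(range(Q), repeat=k)}
kernel = {x for x in product(range(Q), repeat=n) if not any(row_times(x, H))}
print(code == kernel, len(code))   # True 9
```

The brute-force search over all $3^5$ vectors is only feasible for tiny parameters; it is meant as a check, not as a decoding method.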


Exercise 8.2.3 For code C with generator matrix $G = (I_k \;\; A)$ as in (8.9), show that $x \in C$ iff
$xH = 0$, with H as in (8.10).

Hint. It is easy to see that C lies in the kernel of the linear transformation L sending $x \in \mathbb{F}_q^n$
to $xH$. Since

$$\dim \ker L + \dim L(\mathbb{F}_q^n) = n,$$

we find that the dimension of ker L is k and thus obtain the equality of the kernel of L and
the code C.


The parity check matrix is quite useful for decoding. See Dornhoff and Hohn [25] or 
Gallian [33] for more information. 
Our next question is: Where does all our theory of finite fields come in? 


Definition 8.2.4 A linear cyclic code is a linear code C with the property that if
$c = (c_0, c_1, \ldots, c_{n-2}, c_{n-1})$ is a codeword then so is the cyclic permutation of c given by
$c' = (c_{n-1}, c_0, \ldots, c_{n-3}, c_{n-2})$.

Let R denote the factor ring $R = \mathbb{F}_q[x]/(x^n - 1)$. Represent elements of R by polynomials
with coefficients in $\mathbb{F}_q$ of degree $< n$. Identify the codeword $c = (c_0, c_1, \ldots, c_{n-2}, c_{n-1})$ with (the
coset of) the polynomial $c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}$.


Theorem 8.2.2 A linear code C in R is cyclic if and only if it is an ideal in the ring
$R = \mathbb{F}_q[x]/(x^n - 1)$, using the identification of the preceding paragraph.

Proof. First note that a subspace W of R is an ideal if $xW \subset W$, because this implies $x^j W \subset W$,
for all $j = 1, 2, 3, \ldots$ Thus $RW \subset W$.
Now suppose that C is an ideal and $c_0 + c_1 x + \cdots + c_{n-1} x^{n-1} \in C$. Then C contains

$$x(c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}) = c_0 x + c_1 x^2 + \cdots + c_{n-1} x^n \equiv c_{n-1} + c_0 x + c_1 x^2 + \cdots + c_{n-2} x^{n-1} \pmod{x^n - 1}.$$

The last step holds because $x^n$ is congruent to 1 modulo $(x^n - 1)$. So C is cyclic. We
leave the converse as an exercise. ▲

Exercise 8.2.4 Complete the proof of the preceding theorem.

Question. What are the ideals A in the ring $R = \mathbb{F}_q[x]/(x^n - 1)$?


Answer. Just as we found in Section 5.4 for ideals in $\mathbb{Z}_n$, they are principal ideals $(g(x))$,
where $g(x)$ divides $x^n - 1$. We call $g(x)$ the generator of A. If $g(x) = c_0 + c_1 x + \cdots + c_r x^r$
has degree r, then the corresponding code is an $[n, n - r]$-code and a generator matrix for
the code (as defined above) is the $(n - r) \times n$ matrix:

$$\begin{pmatrix}
c_0 & c_1 & c_2 & \cdots & c_r & 0 & 0 & \cdots & 0 \\
0 & c_0 & c_1 & \cdots & c_{r-1} & c_r & 0 & \cdots & 0 \\
0 & 0 & c_0 & \cdots & c_{r-2} & c_{r-1} & c_r & \cdots & 0 \\
\vdots & & & & & & & & \vdots \\
0 & 0 & 0 & \cdots & c_0 & c_1 & c_2 & \cdots & c_r
\end{pmatrix}. \tag{8.11}$$

Exercise 8.2.5 Show that the code described above has dimension $n - r$.

Hint. Since $g(x) = c_0 + c_1 x + \cdots + c_r x^r$ has degree r, the cosets of the vectors $g(x)x^j$, $j = 0, \ldots,
n - r - 1$, are linearly independent in the ring R. These vectors span the ideal $A = (g(x))$
since elements of A have the form $f(x)g(x)$, for some polynomial $f(x)$ of degree less than or
equal to $n - r - 1$.


Example: The Hamming [7, 4, 3]-code. Note that the polynomial $x^7 - 1$ can be completely
factored into irreducibles over $\mathbb{F}_2$ as follows:

$$x^7 - 1 = (x - 1)(x^3 + x + 1)(x^3 + x^2 + 1) \in \mathbb{F}_2[x].$$

Take $g(x) = x^3 + x + 1 \in \mathbb{F}_2[x]$ to generate our ideal I in $R = \mathbb{F}_2[x]/(x^7 - 1)$ corresponding
to the code. The 16 codewords in C, written as coefficient vectors $(c_0, c_1, \ldots, c_6)$, are listed below.

0000000   1101000   0110100   0011010
0001101   1011100   1110010   1100101
0101110   0111001   0010111   1000110
1010001   1111111   0100011   1001011


A generator matrix corresponding to $g(x) = x^3 + x + 1 \in \mathbb{F}_2[x]$ as in (8.11) is

$$G = \begin{pmatrix}
1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1
\end{pmatrix}. \qquad ▲$$

Exercise 8.2.6

(a) Use elementary row operations to put the generator matrix G above into row-reduced
echelon (standard) form.

(b) Explain why the codewords listed above are correct by making a table listing the 16
elements of $\mathbb{F}_2^4$ (the possible messages) in the first column and then listing the
corresponding $xG$ (the encodings of the messages) in the second column.

(c) Explain why the minimum weight of the Hamming [7, 4, 3]-code is 3.

Exercise 8.2.7 Imitate the preceding example except use the polynomial $x^3 + x^2 + 1$ to build
the code instead of $x^3 + x + 1$.
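For readers who like to experiment, here is a small brute-force sketch of the Hamming [7, 4, 3] example with $g(x) = x^3 + x + 1$. The names `encode`, `decode`, and `distance` are invented for this illustration, and nearest-codeword decoding by exhaustive search is only sensible for a code this small. The sketch builds the generator matrix as in (8.11), lists the code, and checks the minimum weight, the cyclic property of Theorem 8.2.2, and single-error correction as in Theorem 8.2.1.

```python
from itertools import product

N, K = 7, 4
G_POLY = [1, 1, 0, 1]      # coefficients c_0, c_1, c_2, c_3 of g(x) = 1 + x + x^3

# Rows of the generator matrix (8.11): shifts of the coefficient vector of g.
G = [[0] * j + G_POLY + [0] * (N - len(G_POLY) - j) for j in range(K)]

def encode(message):
    # message (row vector in F_2^4) times G, reduced mod 2
    return tuple(sum(message[i] * G[i][j] for i in range(K)) % 2 for j in range(N))

def distance(x, y):
    # Hamming distance
    return sum(a != b for a, b in zip(x, y))

# Map each codeword back to the message that produced it.
codewords = {encode(m): m for m in product((0, 1), repeat=K)}

def decode(received):
    # nearest-codeword decoding by exhaustive search
    nearest = min(codewords, key=lambda c: distance(c, received))
    return codewords[nearest]

min_weight = min(sum(c) for c in codewords if any(c))
is_cyclic = all(c[-1:] + c[:-1] in codewords for c in codewords)

message = (1, 0, 1, 1)
sent = encode(message)
corrupted = list(sent)
corrupted[2] ^= 1          # flip one bit in transit
print(len(codewords), min_weight, is_cyclic, decode(tuple(corrupted)) == message)
# 16 3 True True
```

Changing `G_POLY` to the coefficients of $x^3 + x^2 + 1$ gives a way to check your answer to Exercise 8.2.7.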


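Suppose $g(x)h(x) = x^n - 1$ in $\mathbb{F}_q[x]$, with $g(x)$ of degree r, the generator polynomial of a
code C, and $h(x)$ of degree $k = n - r$. Then we get a parity check matrix for the code from
the following matrix associated to the polynomial $h(x) = h_0 + h_1 x + \cdots + h_k x^k$:

$$\begin{pmatrix}
h_k & 0 & \cdots & 0 \\
h_{k-1} & h_k & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
h_0 & h_1 & \cdots & \vdots \\
0 & h_0 & \cdots & \vdots \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & h_0
\end{pmatrix}. \tag{8.12}$$

Here the matrix has n rows and r columns, and column j (for $j = 0, \ldots, r - 1$) is the vector
$(h_k, h_{k-1}, \ldots, h_0)$ shifted down j places, with zeros elsewhere.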

Exercise 8.2.8 


(a) Show that the matrix given in (8.12) is indeed a parity check matrix for our code with 
generator polynomial g(x) as described above and generator matrix given in (8.11). 

(b) Use (8.12) to find a parity check matrix for the Hamming [7, 4, 3]-code with generator
polynomial $g(x) = x^3 + x + 1 \in \mathbb{F}_2[x]$ in the example above.


There is a method for constructing codes that correct lots of errors called BCH codes. See 
Dornhoff and Hohn [25, p. 442] for the mathematical details. Here we only sketch a bit of 
the theory. 

Suppose that $\gamma \in E$ and $E \supset F$ are finite fields. Recall that the minimal polynomial of $\gamma$
over F is the polynomial $f \in F[x]$ of least degree such that $f(\gamma) = 0$.

Recall that we obtained the Hamming [7, 4, 3]-code by looking at the generator polynomial
$g(x) = x^3 + x + 1$. This is the minimal polynomial of an element $\gamma$ of $\mathbb{F}_8$ whose
other roots are $\gamma^2$ and $\gamma^4$. So we could say that any polynomial $f(x)$ is in our code C
iff $f(\gamma^j) = 0$, $j = 1, 2, 4$. For any polynomial whose roots include the roots of $g(x)$ must be
divisible by $g(x)$.


Definition 8.2.5 A primitive nth root of 1 in a field K is a solution $\gamma$ to $\gamma^n = 1$ such that
$\gamma^m \neq 1$, for $1 \le m < n$.

Theorem 8.2.3 (Bose-Chaudhuri and Hocquenghem, 1960) Suppose $\gcd(n, q) = 1$. Let $\gamma$
be a primitive nth root of 1 in an extension field of $\mathbb{F}_q$. Suppose the generator polynomial
$g(x)$ of a cyclic code of length n over $\mathbb{F}_q$ has $\gamma, \gamma^2, \ldots, \gamma^{d-1}$ among its roots. Then the
minimum distance of a nonzero code element from 0 is at least d.


For a proof see Dornhoff and Hohn [25, pp. 442-443]. 

A Reed-Solomon code is a BCH code with $n = q - 1$. It is also assumed that $\gamma \in \mathbb{F}_q$. These
codes are used by the makers of CD players, NASA, and others. They can be used to correct
amazing numbers of errors. If you suppose $q = 2^8$ so that $n = 255$, a 5-error-correcting
code has $g(x) = (x - \gamma)(x - \gamma^2) \cdots (x - \gamma^{10})$ of degree 10, where $\gamma$ is a primitive nth root
of unity. Elements of $\mathbb{F}_{256}$ are eight-dimensional vectors over $\mathbb{F}_2$. This code can be used as
a code of length $8 \times 255 = 2040$ over $\mathbb{F}_2$, which can correct any “burst” of 33 consecutive
errors. For any 33 consecutive errors over $\mathbb{F}_2$ will affect at most five of the elements of
$\mathbb{F}_{256}$. See Dornhoff and Hohn [25, p. 444] for more of an explanation of this error-correction
ability.
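The counting behind the burst claim is simple enough to confirm by exhaustion. The sketch below assumes, for illustration, that the 2040 bits are laid out as 255 consecutive 8-bit symbols; `symbols_hit` is a made-up helper that counts how many symbols a burst touches.

```python
def symbols_hit(start, burst_len, bits_per_symbol=8):
    """Number of symbols touched by a burst of consecutive bits."""
    first = start // bits_per_symbol
    last = (start + burst_len - 1) // bits_per_symbol
    return last - first + 1

TOTAL_BITS = 8 * 255
worst_33 = max(symbols_hit(s, 33) for s in range(TOTAL_BITS - 33 + 1))
worst_34 = max(symbols_hit(s, 34) for s in range(TOTAL_BITS - 34 + 1))
print(worst_33, worst_34)   # 5 6
```

So a burst of 33 bits never damages more than five symbols, which a 5-error-correcting code over $\mathbb{F}_{256}$ can repair, while a burst of 34 bits can already damage six.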

Feedback shift registers are of use in encoding and decoding cyclic codes. See Dornhoff 
and Hohn [25, pp. 449ff] and Pless [85]. 


Exercise 8.2.9 Suppose that E is the splitting field of $f(x) = x^n - 1$ over $\mathbb{F}_q$. Here n is a
positive integer and q is a power of a prime. Suppose in addition that $\gcd(n, q) = 1$. Show
that $f(x)$ has n distinct roots in E.

Exercise 8.2.10 Show that $\gcd(n, q) = 1$ iff there is a primitive nth root of 1 in the splitting
field of $x^n - 1$ over $\mathbb{F}_q$.


Example: Codes from the Hadamard Matrix. The code used in the 1969 NASA Mariner 9
spacecraft which orbited Mars comes from the Hadamard matrix $H_{32} = \left( (-1)^{u \cdot v} \right)_{u, v \in \mathbb{F}_2^5}$,
with u, v ordered as for the corresponding numbers in binary and $u \cdot v = \sum_{i=1}^{5} u_i v_i$. This
matrix is pictured in Figure 8.10.

The code is found by forming the new matrix $G = \Phi \begin{pmatrix} H_{32} \\ -H_{32} \end{pmatrix}$, where $\Phi$ replaces 1s with
0s and $-1$s with 1s. The rows of G are the codewords of the [32, 6, 16] Reed-Muller code
used in the Mariner Mars probe. ▲


Exercise 8.2.11 How many errors can the [32,6, 16] Reed-Muller code correct? 


Exercise 8.2.12 Consider the [4, 3, 2] Hadamard matrix code with generator matrix
$G = \Phi \begin{pmatrix} H_4 \\ -H_4 \end{pmatrix}$, using the notation of the last example. Show that the dimension of the code is
indeed 3. Then show that the minimum weight of vectors in the code is indeed 2.

The general Hadamard matrix $H_{2^n} = \left( (-1)^{u \cdot v} \right)_{u, v \in \mathbb{F}_2^n}$ has the inductive (or recursive)
definition

$$H_{2^{n+1}} = \begin{pmatrix} H_{2^n} & H_{2^n} \\ H_{2^n} & -H_{2^n} \end{pmatrix}, \quad \text{with } H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{8.13}$$

Figure 8.10 The matrix $H_{32}$, where the 1s and $-1$s have become red and purple


The matrix $H_m$ is defined to be a matrix of 1s and $-1$s such that $H_m \, {}^tH_m = mI$, where I is
the identity matrix. For $H_m$ to exist with $m > 2$, it is necessary that 4 divide m. The smallest
values of m without a construction of $H_m$ are m = 428 and 668, according to the chapter
on combinatorial designs in the handbook edited by K. H. Rosen [92].

Why did Hadamard study these matrices? He wanted a matrix $H_m$ with entries $h_{ij}$ such
that $|h_{ij}| \le 1$ and $|\det(H_m)|$ is maximal (i.e., $m^{m/2}$). In Terras [116, p. 172], we note that
the Hadamard matrix $H_{2^n}$ is a matrix for the linear transformation given by the Fourier
transform (or DFT) on the group $\mathbb{F}_2^n$.
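The recursion (8.13), the description of the entries as $(-1)^{u \cdot v}$, and the Mariner code can all be checked numerically for $n = 5$. This is a sketch only: `hadamard`, `dot_bits`, and `phi` are names chosen for this illustration, with `phi` playing the role of the map $\Phi$ that replaces 1s with 0s and $-1$s with 1s.

```python
def hadamard(n):
    """Build H_{2^n} from the block recursion, starting from the 1 x 1 matrix (1)."""
    H = [[1]]
    for _ in range(n):
        top = [row + row for row in H]
        bottom = [row + [-x for x in row] for row in H]
        H = top + bottom
    return H

def dot_bits(u, v):
    # u . v for the binary expansions of u and v (mod 2 is all that matters below)
    return bin(u & v).count("1")

def phi(row):
    # replace 1 by 0 and -1 by 1
    return tuple(0 if x == 1 else 1 for x in row)

n = 5
size = 2 ** n
H = hadamard(n)

matches = all(H[u][v] == (-1) ** dot_bits(u, v)
              for u in range(size) for v in range(size))
gram_ok = all(sum(H[i][k] * H[j][k] for k in range(size)) == (size if i == j else 0)
              for i in range(size) for j in range(size))

code = {phi(row) for row in H} | {phi([-x for x in row]) for row in H}
min_weight = min(sum(c) for c in code if any(c))
print(matches, gram_ok, len(code), min_weight)   # True True 64 16
```

The 64 codewords correspond to dimension 6 over $\mathbb{F}_2$, and the minimum nonzero weight 16 matches the parameters [32, 6, 16].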

H. B. Mann [73] gives more information on the code used in the Mariner Mars probe, as 
well as on the history of error-correcting codes. In this book one finds a limerick inspired 
by the coding theorist Jessie MacWilliams: 


Delight in your algebra dressy 

But take heed from a lady named Jessie 
Who spoke to us here of her primitive fear 
That good codes just might be messy. 


W. W. Rouse Ball and H. S. M. Coxeter [95] give more recreational aspects of Hadamard 
matrices. See also F. Jessie MacWilliams and Neil J. A. Sloane [72]. 


Exercise 8.2.13 Check that the definition (8.13) implies that $H_{2^n} \, {}^tH_{2^n} = 2^n I$, where I is the
identity matrix.

Exercise 8.2.14 Consider the [12, 6] extended ternary (i.e., over $\mathbb{F}_3$) Golay code with
generator matrix $(I_6 \;\; A)$, where

$$A = \begin{pmatrix}
0 & 1 & 1 & 1 & 1 & 1 \\
1 & 0 & 1 & 2 & 2 & 1 \\
1 & 1 & 0 & 1 & 2 & 2 \\
1 & 2 & 1 & 0 & 1 & 2 \\
1 & 2 & 2 & 1 & 0 & 1 \\
1 & 1 & 2 & 2 & 1 & 0
\end{pmatrix}.$$

Show that the minimum Hamming weight of a nonzero codeword is 6.


Hint. You may want to look at some of the references such as Pless [85] for results on 
weights of vectors in such self-dual codes. 
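Since the code has only $3^6 = 729$ codewords, the claim of the exercise can also be checked by brute force before you look for a proof. The sketch below is an illustration, not a substitute for the argument the exercise asks for; `encode` is a hypothetical helper, and the matrix `A` is the $6 \times 6$ block with rows 011111, 101221, 110122, 121012, 122101, 112210.

```python
from itertools import product

A = [
    [0, 1, 1, 1, 1, 1],
    [1, 0, 1, 2, 2, 1],
    [1, 1, 0, 1, 2, 2],
    [1, 2, 1, 0, 1, 2],
    [1, 2, 2, 1, 0, 1],
    [1, 1, 2, 2, 1, 0],
]
# Generator matrix (I_6  A), a 6 x 12 matrix over F_3.
G = [[int(i == j) for j in range(6)] + A[i] for i in range(6)]

def encode(message):
    # message (row vector in F_3^6) times G, reduced mod 3
    return tuple(sum(m * g for m, g in zip(message, col)) % 3 for col in zip(*G))

weights = {}
for message in product(range(3), repeat=6):
    w = sum(1 for c in encode(message) if c != 0)
    weights[w] = weights.get(w, 0) + 1

min_nonzero_weight = min(w for w in weights if w > 0)
print(sorted(weights.items()), min_nonzero_weight)
```

The dictionary `weights` records how many codewords have each Hamming weight, so it also shows which weights actually occur.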


The code in the preceding exercise was found by M. J. Golay in 1949. The ternary [11, 6]
cyclic code can be found by factoring (over $\mathbb{F}_3[x]$)

$$x^{11} - 1 = (x - 1)(x^5 - x^3 + x^2 - x - 1)(x^5 + x^4 - x^3 + x^2 - 1).$$
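A factorization like this is easy to confirm by multiplying out. The sketch below takes the factors of $x^{11} - 1$ over $\mathbb{F}_3$ to be $x - 1$, $x^5 - x^3 + x^2 - x - 1$, and $x^5 + x^4 - x^3 + x^2 - 1$ (stated here explicitly so the check is self-contained) and multiplies them with coefficients reduced mod 3; `poly_mul` is a made-up helper.

```python
def poly_mul(f, g, p):
    """Multiply polynomials given as coefficient lists (constant term first), mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

linear = [-1, 1]                     # x - 1
quintic_a = [-1, -1, 1, -1, 0, 1]    # x^5 - x^3 + x^2 - x - 1
quintic_b = [-1, 0, 1, -1, 1, 1]     # x^5 + x^4 - x^3 + x^2 - 1

result = poly_mul(poly_mul(linear, quintic_a, 3), quintic_b, 3)
target = [(-1) % 3] + [0] * 10 + [1]   # x^11 - 1 with coefficients in {0, 1, 2}
print(result == target)                # True
```

Either quintic can serve as the generator polynomial of a cyclic [11, 6] code, since $11 - 5 = 6$.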


This is also a quadratic residue code. To obtain a ternary quadratic residue code we
proceed as follows. Let p be a prime such that 3 is a square mod p. Suppose $\zeta$ is a primitive
pth root of unity in some field containing $\mathbb{F}_3$, as in Definition 8.2.5. Then let $L$ denote the
set of squares in $\mathbb{F}_p^*$ and let $L'$ be the set of non-squares in $\mathbb{F}_p^*$. Define the polynomials

$$q(x) = \prod_{j \in L} (x - \zeta^j) \quad \text{and} \quad n(x) = \prod_{j \in L'} (x - \zeta^j).$$

One can show that the polynomials $q(x)$ and $n(x)$ have coefficients in $\mathbb{F}_3$ and that $x^p - 1 =
(x - 1)q(x)n(x)$.

We mentioned the Fourier transform on the group $\mathbb{F}_2^n$ in our discussion of Hadamard
matrices. Let us say a bit more about this subject. Functions $f : \mathbb{F}_2^n \to \{0, 1\}$ are called
Boolean functions or switching functions. The Fourier analysis of such functions has many
applications in computer science as well as the theory of voting. A recent reference is the
book of O’Donnell [83]. There one finds, for example, a use of Fourier analysis on the group
$\mathbb{F}_2^n$ to prove Arrow’s theorem. This theorem says that in an election with three candidates
for office any voting rule - such as majority rule - can produce a paradoxical result unless
one of the n voters is a dictator who decides the election. Fourier analysis in the case of the
additive group $\mathbb{F}_2^n$ works much the same as it did for the group $\mathbb{Z}_n$ in Section 4.2. For those
of us made miserable by the last US election, we recommend considering the mathematical
theory of voting as a way to work through the grief. If nothing else, your mind will be
diverted into the construction of a dictionary relating the notation for Fourier analysis on
$\mathbb{F}_2^n$ in [83] and the notation in [116]. There is also a theory of influence that puts California
as the most influential state, which makes me wonder how realistic the mathematical theory
of voting really is. Complicated voting rules such as that of the electoral college are
certainly worth studying, in retrospect, as is the mathematics of gerrymandering.

Let $G = \mathbb{F}_2^n$ under addition. The dual group consists of characters $\chi_a(x) = (-1)^{{}^t a x}$, where
${}^t a x = \sum_{1 \le i \le n} a_i x_i$, if the column vectors $x = (x_i)$ and $a = (a_i)$ both come from $\mathbb{F}_2^n$. The
characters are real valued in this case - unlike that of Section 4.2. The group operation for
characters is pointwise multiplication. Analogously to equation (4.2), we define the Fourier
transform of a function $f : \mathbb{F}_2^n \to \mathbb{R}$ to be $\hat{f} : \mathbb{F}_2^n \to \mathbb{R}$, with

$$\hat{f}(a) = \frac{1}{2^n} \sum_{x \in \mathbb{F}_2^n} f(x) \chi_a(x).$$


In the next exercise we ask you to prove the analog of Theorem 4.2.1. 


Exercise 8.2.15 


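(a) Define convolution of functions on $\mathbb{F}_2^n$ by an analogous equation to (4.1). Then show
that $\widehat{f * g} = \hat{f} \cdot \hat{g}$.
(b) State and prove the inversion formula for the Fourier transform on $\mathbb{F}_2^n$.

Define the Hamming shell to be

$$S_r = \{ x \in \mathbb{F}_2^n \mid \|x\| = r \}.$$

Define

$$\delta_{S_r}(x) = \begin{cases} 1, & x \in S_r, \\ 0, & x \notin S_r. \end{cases}$$

The Fourier transform for $\mathbb{F}_2^n$ has a special name, the Krawtchouk polynomial, when applied
to the function $\delta_{S_r}$. It is given by

$$K_r(\|a\|; n) = 2^n \hat{\delta}_{S_r}(a) = \sum_{\substack{x \in \mathbb{F}_2^n \\ \|x\| = r}} (-1)^{{}^t a x}.$$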


I discuss its properties in Terras [116, p. 178]. 
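That the sum over the Hamming shell depends only on the weight of $a$ can be seen experimentally. The sketch below compares the shell sum with the standard explicit formula for binary Krawtchouk polynomials, $K_r(w; n) = \sum_j (-1)^j \binom{w}{j} \binom{n-w}{r-j}$. That formula is a well-known fact brought in for the comparison, not something derived above, and the names `shell_sum` and `krawtchouk` are invented here.

```python
from itertools import product
from math import comb

def shell_sum(a, r):
    """Sum of (-1)^(a . x) over all x in F_2^n of Hamming weight r."""
    n = len(a)
    total = 0
    for x in product((0, 1), repeat=n):
        if sum(x) == r:
            total += (-1) ** sum(ai * xi for ai, xi in zip(a, x))
    return total

def krawtchouk(w, r, n):
    """Explicit binary Krawtchouk polynomial K_r(w; n)."""
    return sum((-1) ** j * comb(w, j) * comb(n - w, r - j) for j in range(r + 1))

n = 5
agree = all(shell_sum(a, r) == krawtchouk(sum(a), r, n)
            for a in product((0, 1), repeat=n) for r in range(n + 1))
print(agree)                                        # True
print([krawtchouk(w, 1, n) for w in range(n + 1)])  # [5, 3, 1, -1, -3, -5]
```

The second line of output illustrates the special case $K_1(w; n) = n - 2w$.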


Exercise 8.2.16 Consider an election involving voting for two candidates. If there are n
voters, you can view the voting rule as a function $f : \mathbb{F}_2^n \to \{0, 1\}$. The usual such rule is
majority rule which works to produce a well-defined winner if n is odd. When n is even,
there could be a tie. Define $\mathrm{maj}(x) = 1$ iff the Hamming weight of x satisfies $\|x\| \ge \lceil n/2 \rceil$, and
$\mathrm{maj}(x) = 0$, otherwise. Here $\lceil x \rceil$ is the ceiling of x - the smallest integer $\ge x$. The rule is well
defined when n is even, but is perhaps not really a good voting rule in that case. Compute
the Fourier transform $\widehat{\mathrm{maj}}$ if n = 3 or 5.

Hint. $\mathrm{maj} = \sum_{r \ge \lceil n/2 \rceil} \delta_{S_r}$.


8.3 Finite Upper Half Planes and Ramanujan Graphs 


In this section we construct a finite analog of the real non-Euclidean Poincaré upper
half plane H consisting of points $z = x + iy$, with $x, y \in \mathbb{R}$ and $y > 0$. The Poincaré
upper half plane has a distance element ds defined by:

$$ds^2 = \frac{dx^2 + dy^2}{y^2}.$$

One can show that ds is invariant under the group of fractional linear transformations

$$z \longmapsto \frac{az + b}{cz + d}, \quad \text{with } a, b, c, d \in \mathbb{R} \text{ and } ad - bc > 0.$$


Moreover, methods from Section 4.3 can be used to show that the geodesics or distance- 
minimizing curves for this Poincaré distance are half lines and half circles perpendicular 
to the real axis. Viewing these as the straight lines of our geometry makes Euclid’s fifth 
postulate false. Thus we get Henri Poincaré’s model of non-Euclidean geometry. See Terras 
[118] for more information. Number theorists are enamoured of functions on H which have 
invariance properties under the action of the modular group SL(2, Z) of fractional linear 
transformations with integer a,b,c,d and ad— bc=1. We gave the density plot of the 
absolute value of such a function in Figure 2.7 in Section 2.1. Such functions are called 
modular forms and they play a key role in modern number theory. 

Here we want to consider a finite analog of the Poincaré upper half plane. Suppose that
$\mathbb{F}_q$ is a finite field of odd characteristic p. This implies that $q = p^r$. Suppose $\delta$ is a fixed
non-square in $\mathbb{F}_q$. The finite upper half plane over $\mathbb{F}_q$ is defined to be

$$H_q = \{ z = x + y\sqrt{\delta} \mid x, y \in \mathbb{F}_q,\ y \neq 0 \}.$$

For $z = x + y\sqrt{\delta} \in \mathbb{F}_q(\sqrt{\delta})$, with $x, y \in \mathbb{F}_q$, we will write the real part of z as $\mathrm{Re}(z) = x$
and the imaginary part of z as $\mathrm{Im}(z) = y$. Our finite analog of complex conjugate is given by
defining the conjugate of z to be $\bar{z} = x - y\sqrt{\delta} = z^q$ and the norm of z to be $Nz = z\bar{z}$.

Perhaps you will object to the use of the word “upper.” Since we have no good
notion of > for finite fields, we use the word “upper” thinking, for example, if q = p, the
y-coordinate of a point is in the set $\{1, 2, \ldots, p - 1\}$ of “positive” numbers. That is perhaps
a cheat and we should really view $H_q$ as a union of an upper and a lower half plane, with
the y-coordinate of a point in the set $\{-\frac{p-1}{2}, \ldots, -1, 1, \ldots, \frac{p-1}{2}\}$. You be the judge.

The general linear group $GL(2, \mathbb{F}_q)$ of matrices $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with $ad - bc \neq 0$ acts on
$z \in H_q$ by the fractional linear transformation:

$$gz = \frac{az + b}{cz + d}. \tag{8.14}$$


Exercise 8.3.1 Show that if we consider the action of $g \in GL(2, \mathbb{F}_q)$ on $z \in H_q$ defined by
equation (8.14), then $\mathrm{Im}(z) \neq 0$ implies $\mathrm{Im}(gz) \neq 0$. Then show that this does indeed give a
group action of $GL(2, \mathbb{F}_q)$ on $H_q$.

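Exercise 8.3.2

(a) Show that, with the group action in equation (8.14),

$$K = \{ g \in GL(2, \mathbb{F}_q) \mid g\sqrt{\delta} = \sqrt{\delta} \} = \left\{ \begin{pmatrix} a & b\delta \\ b & a \end{pmatrix} \;\middle|\; a, b \in \mathbb{F}_q \text{ with } a^2 - \delta b^2 \neq 0 \right\}.$$

(b) Show that K is a subgroup of $G = GL(2, \mathbb{F}_q)$ which is isomorphic to the multiplicative
group $\mathbb{F}_q(\sqrt{\delta})^*$.
(c) Show that we can identify $G/K$ with $H_q$.

Hint on (b). The isomorphism is given by $\begin{pmatrix} a & b\delta \\ b & a \end{pmatrix} \longmapsto a + b\sqrt{\delta}$.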

The subgroup K of $GL(2, \mathbb{F}_q)$ considered in the preceding exercise is analogous
to the orthogonal subgroup of the general linear group $GL(2, \mathbb{R})$, namely, $O(2, \mathbb{R}) =
\{ g \in GL(2, \mathbb{R}) \mid {}^t g g = I \}$ consisting of rotation matrices.

The finite Poincaré distance on $H_q$ is defined to be

$$d(z, w) = \frac{N(z - w)}{\mathrm{Im}\, z \; \mathrm{Im}\, w}.$$

The distance has values in F,. Thus we are not talking about a metric here. There is no 
possibility of a triangle inequality such as we had with the Hamming metric in Section 8.2. 
Perhaps we should call this distance a pseudo-distance instead. 


Exercise 8.3.3 Let $z = x + y\sqrt{\delta}$ and $w = u + v\sqrt{\delta}$, with $x, y, u, v \in \mathbb{F}_q$ and $yv \neq 0$. Show that

$$d(z, w) = \frac{(x - u)^2 - \delta (y - v)^2}{yv}.$$

Exercise 8.3.4 Show that $d(gz, gw) = d(z, w)$ for all $g \in GL(2, \mathbb{F}_q)$ and all $z, w \in H_q$.

We can draw a contour map of the distance function by making a grid representing the
finite upper half plane and coloring the point $x + y\sqrt{\delta}$ according to the value of

$$d(z, \sqrt{\delta}) = \frac{x^2 - \delta (y - 1)^2}{y}.$$


When q = 163 we get Figure 8.11. This figure should be compared with the analogous figure 
obtained using an analog of the Euclidean distance on a finite plane given in Figure 5.1. 
I see monsters in Figure 8.11. My website has a movie of such figures for various values 
of p. 


Exercise 8.3.5 Make a figure analogous to Figure 8.11 using the “Euclidean” distance
$d((x, y), (0, 0)) = x^2 + y^2$, for $(x, y) \in \mathbb{F}_{163} \times \mathbb{F}_{163}$.


Next we want to define some graphs attached to this stuff. 


Definition 8.3.1 Let $a \in \mathbb{F}_q$ and define the finite upper half plane graph $X_q(\delta, a)$ to
have vertices the elements of $H_q$ and then draw an edge between two vertices z, w iff
$d(z, w) = a$.


Example: The Octahedron. Let $q = 3$, $\delta = -1 \equiv 2 \pmod{3}$, and $a = 1$. We will write $i = \sqrt{-1}$.
To draw the graph $X_3(-1, 1)$ we need to find the points adjacent to i, for example. These
are the points $z = x + iy$ such that

$$d(z, i) = \frac{N(z - i)}{y} = \frac{x^2 + (y - 1)^2}{y} = 1.$$

This is equivalent to solving $x^2 + (y - 1)^2 = y$, for $x, y \in \mathbb{Z}_3$ with $y \neq 0$. Solutions are the
four points $1 + i$, $1 - i$, $-1 + i$, $-1 - i$. To find the points adjacent to any point $a + bi \in H_3$
just apply the matrix $\begin{pmatrix} b & a \\ 0 & 1 \end{pmatrix}$ to the points $\pm 1 \pm i$ that we just found. The graph $X_3(-1, 1)$
is drawn on the left in Figure 8.12. It is an octahedron. ▲

Figure 8.11 Color at point $z = x + y\sqrt{\delta}$ in $H_{163}$ is found by computing the Poincaré distance
$d(z, \sqrt{\delta})$


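The adjacency matrix A of the octahedron graph is the $6 \times 6$ matrix below of 0s and 1s
where the i, j entry is 1 iff vertex i is adjacent (i.e., joined by an edge) to vertex j. Here the
vertices are ordered so that vertex j and vertex j + 3 are opposite (non-adjacent) corners.

$$A = \begin{pmatrix}
0 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 \\
0 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 0
\end{pmatrix}$$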

The eigenvalues $\lambda \in \mathbb{C}$ of A are the solutions of $\det(A - \lambda I) = 0$. The set of eigenvalues is
spec(A) = {4, -2, -2, 0, 0, 0}. Note that the second largest eigenvalue in absolute value,
which is $|\lambda| = 2$, satisfies $|\lambda| \le 2\sqrt{3} \approx 3.46$. This means that the graph $X_3(-1, 1)$ is what is
called a Ramanujan graph.

Figure 8.12 The graph on the left is $X_3(-1, 1)$, an octahedron, and that on the right is $X_5(2, 1)$ with
the edges in green. The pink dashed lines on the right are the dodecahedron
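The finite upper half plane graphs are small enough to build directly from Definition 8.3.1 when q = p is prime. The following sketch makes that assumption (so that arithmetic is just arithmetic mod p) and uses the formula for the distance from Exercise 8.3.3; the function names are invented for the illustration. Rather than computing eigenvalues numerically, it checks that the adjacency matrix A of $X_3(-1, 1)$ satisfies $A(A + 2I)(A - 4I) = 0$, which for a symmetric matrix forces every eigenvalue to lie in {0, -2, 4}.

```python
def upper_half_plane_graph(p, delta, a):
    """Vertices (x, y) stand for x + y*sqrt(delta) with y != 0; edge iff d(z, w) = a."""
    vertices = [(x, y) for x in range(p) for y in range(1, p)]

    def dist(z, w):
        (x, y), (u, v) = z, w
        inv = pow(y * v, p - 2, p)      # inverse of y*v mod p, by Fermat's little theorem
        return ((x - u) ** 2 - delta * (y - v) ** 2) * inv % p

    adjacency = [[int(z != w and dist(z, w) == a) for w in vertices] for z in vertices]
    return vertices, adjacency

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def shifted(A, c):
    # the matrix A + c*I
    return [[A[i][j] + (c if i == j else 0) for j in range(len(A))] for i in range(len(A))]

vertices, A = upper_half_plane_graph(3, 2, 1)      # delta = 2 = -1 mod 3
degrees = {sum(row) for row in A}
annihilated = mat_mul(mat_mul(A, shifted(A, 2)), shifted(A, -4))
print(len(vertices), degrees, all(x == 0 for row in annihilated for x in row))
# 6 {4} True

_, B = upper_half_plane_graph(5, 2, 1)             # 2 is a non-square mod 5
print({sum(row) for row in B})                     # {6}
```

Both graphs come out regular of degree q + 1, and for q = 3 the possible eigenvalues 0, -2, 4 agree with the spectrum listed above.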

Figure 8.12 also shows $X_5(2, 1)$ on the right. The solid lines are the edges of the graph.
The dotted lines are the edges of a dodecahedron. We can view the graph $X_5(2, 1)$ as that
which you get by putting a five-pointed star on each face of a dodecahedron.

A graph X is called k-regular if there are k edges coming out of every vertex. We say that
a k-regular graph is a Ramanujan graph if for all eigenvalues $\lambda$ of the adjacency matrix
such that $|\lambda| \neq k$, we have $|\lambda| \le 2\sqrt{k - 1}$. This definition was made by Lubotzky, Phillips
and Sarnak in a paper from 1988. It turns out that such graphs provide good communication
networks as the random walk on them converges rapidly to uniform. We will say more about
that sort of thing in the next section. In the 1980s Margulis and independently Lubotzky,
Phillips and Sarnak found infinite families of Ramanujan graphs of fixed degree. We will
say a bit more about these examples at the end of this section. See also Giuliana Davidoff,
Peter Sarnak, and Alain Valette [23] or Terras [116]. Denis Charles, Eyal Goren, and Kristin
Lauter [13] give applications of Ramanujan graphs to cryptography.

Of course, one really wants infinite families of Ramanujan graphs of fixed small degree. 
The finite upper half plane graphs $X_q(\delta, a)$ have degree $q + 1$ provided that $a \neq 0$ or $4\delta$.
These finite upper half plane graphs were proved to be Ramanujan by N. Katz using work of 
Soto-Andrade and estimates of exponential sums. See Terras [116] for more of the history. 
Ramanujan graphs are also good expander graphs, meaning that if they form a gossip 
network, the gossip gets out fast. Sarnak [99] states: “it is in applications in theoretical 
computer science where expanders have had their major impact. Among their applications 
are the design of explicit superefficient communication networks, constructions of error- 
correcting codes with very efficient encoding and decoding algorithms, derandomization 
of random algorithms, and analysis of algorithms in computational group theory ...” The 
subject of expander graphs of this sort has an accessible introduction in the book by Mike 
Krebs and Tony Shaheen [61]. 

Now we can explain Figure 5.2 in Section 5.1. The picture is that of points (x,y), with
$x, y \in \mathbb{F}_{121}$ and $y \neq 0$. Take $\delta \in \mathbb{F}_{121}$ to be a non-square. View a point (x,y) as
$z = x + y\sqrt{\delta} \in H_{121} \subset \mathbb{F}_{121}[\sqrt{\delta}]$. Let 2 × 2 matrices
$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, \mathbb{F}_{11})$ act on $z = x + y\sqrt{\delta}$ by fractional
linear transformation $gz = \frac{az+b}{cz+d}$. Color two points z and w the same color if there is a
matrix $g \in GL(2, \mathbb{F}_{11})$ such that $w = gz$. This gives the picture in Figure 5.2. Figure 8.13
is another version of that figure. This figure is reminiscent of tessellations of the real
Poincaré upper half plane H obtained by translating a fundamental domain $D = \Gamma \backslash H$ around
using elements of the discrete group $\Gamma$, which is usually some group like a subgroup of the
modular group SL(2, Z). There are some beautiful tessellations on Helena Verrill's website:
www.math.lsu/~verrill/. Such tessellations of the unit disc inspired many pictures by the
artist M. C. Escher. The tessellation of the Poincaré upper half plane by SL(2,Z) is visible
in Figure 2.7.


Figure 8.13 Another version of Figure 5.2 


Exercise 8.3.6 Apply Burnside’s lemma from Section 3.7 to GL(2, Fp) acting on H, to find 
out how many colors need to be used in creating the analog of Figure 5.2 for an arbitrary 
odd prime p. 


As we have said, Lubotzky, Phillips, and Sarnak along with Margulis constructed the first
infinite families of Ramanujan graphs. We consider these examples briefly. The construction
involves the projective general linear group over $\mathbb{Z}_q$, for prime q, defined by
$PGL(2, \mathbb{Z}_q) = GL(2, \mathbb{Z}_q)/Z$, where Z denotes the center of $GL(2, \mathbb{Z}_q)$. Here, as usual,
$GL(2, \mathbb{Z}_q)$ consists of all 2 × 2 matrices with entries in $\mathbb{Z}_q$ and nonzero determinant.


Exercise 8.3.7 Describe the elements of the center of GL(2, Zq). 


The Lubotzky, Phillips, and Sarnak construction requires two distinct odd primes p and
q, both congruent to 1 modulo 4. Then take i to be any integer such that $i^2 \equiv -1 \pmod{q}$.
It can be shown (using an old formula of Jacobi proved using theta functions) that this
implies that there are p + 1 vectors

$$(a,b,c,d) \in \mathbb{Z}^4 \quad \text{such that} \quad p = a^2 + b^2 + c^2 + d^2, \tag{8.15}$$

where a is odd and positive, while b, c, d are all even.
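
The count of p + 1 vectors in (8.15) can be checked by brute force for small primes. The following is only a sketch: the function name `lps_vectors` and the choice of test primes are assumptions made for this illustration, not part of the construction itself.

```python
from itertools import product
from math import isqrt

def lps_vectors(p):
    """All (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = p,
    a odd and positive, and b, c, d even (as in (8.15))."""
    bound = isqrt(p)
    evens = [x for x in range(-bound, bound + 1) if x % 2 == 0]
    odds = range(1, bound + 1, 2)
    return [(a, b, c, d)
            for a in odds
            for b, c, d in product(evens, repeat=3)
            if a * a + b * b + c * c + d * d == p]

# Primes congruent to 1 modulo 4; each should yield p + 1 vectors.
for p in (5, 13, 17, 29):
    print(p, len(lps_vectors(p)))
```

For p = 5, for instance, the six vectors are (1, ±2, 0, 0), (1, 0, ±2, 0), and (1, 0, 0, ±2).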
If, in addition, p is not a square mod q, then define the Cayley graph to be
$X^{p,q} = X(PGL(2, \mathbb{Z}_q), S)$, where S is the set consisting of the matrices

$$\begin{pmatrix} a+ib & c+id \\ -c+id & a-ib \end{pmatrix},$$

with (a,b,c,d) as in (8.15). When $q > 2\sqrt{p}$, Davidoff, Sarnak, and Valette [23] show that
the graph does have degree p + 1. When p is a square mod q, then, in the definition of $X^{p,q}$,
the group $PGL(2, \mathbb{Z}_q)$ is replaced with $PSL(2, \mathbb{Z}_q) = SL(2, \mathbb{Z}_q)/Z$, where Z is the center
of $SL(2, \mathbb{Z}_q)$. Here $SL(2, \mathbb{Z}_q)$ consists of the determinant 1 matrices in $GL(2, \mathbb{Z}_q)$. Davidoff,
Sarnak, and Valette [23] show that if $p > 5$, $q > p^8$, then the graph $X^{p,q}$ is connected. They
also give proofs of lower bounds on some of the interesting graph-theoretic constants
of these graphs but do not manage to give a proof of Ramanujanicity. A proof that
these graphs are indeed Ramanujan can be found in Sarnak [98]. Quaternion algebras
are involved in the creation of these graphs as well as the proofs. One also needs the
Ramanujan conjecture on the size of Fourier coefficients of modular forms - at least a special
case that was known before Deligne's proof of the conjecture. See Sarnak [98] for more
information.

Even the smallest examples of the Lubotzky, Phillips, Sarnak graphs $X^{p,q}$ are quite large.
Thus it is of interest that Sarnak's student Patrick Chiu managed to do the case of 3-regular
graphs (see [14]). In the case that q = 3, he gets the smallest such Ramanujan graph $X^{2,3}$
which can be constructed in a similar way to the Lubotzky, Phillips, and Sarnak graphs
above. The generating set S is


[Display: the three 2 × 2 matrices forming S; the entries are not legible in this copy.]

The Cayley graph is then $X^{2,3} = X(G, S)$ with $G = PGL(2, \mathbb{Z}_3)$.


Exercise 8.3.8 Draw the Cayley graph $X^{2,3}$ and compute the spectrum of its adjacency
matrix. Prove that it is a Ramanujan graph.


Exercise 8.3.9 Show that in the Lubotzky, Phillips, and Sarnak construction above, when p
is a square mod q, the matrices

$$\begin{pmatrix} a+ib & c+id \\ -c+id & a-ib \end{pmatrix}$$

do not generate $PGL(2, \mathbb{Z}_q)$, since they are in a subgroup of index 2. This means that the
Cayley graph $X(PGL(2, \mathbb{Z}_q), S)$ is not connected in this case.


Exercise 8.3.10 Show that in the Lubotzky, Phillips, and Sarnak construction above, when
p is not a square mod q, the Cayley graph $X^{p,q} = X(PGL(2, \mathbb{Z}_q), S)$ has $q(q^2 - 1)$ vertices,
while the graph $X^{p,q}$ has half that many vertices when p is a square mod q.


Exercise 8.3.11 Consider the case p = 5, q = 13 of the Lubotzky, Phillips, and Sarnak
construction above. What is the degree of this graph $X^{5,13}$? How many vertices does $X^{5,13}$
have? Is the graph connected? If you have a computer available, compute the spectrum of
the adjacency matrix of $X^{5,13}$ and check whether the graph is indeed a Ramanujan graph.
If possible, draw the graph.


Before leaving the subject of Ramanujan graphs, it is perhaps time to say a bit about 
Srinivasa Ramanujan. He lived from 1887 to 1920 and is one of the few mathematicians who 
has had a movie devoted to his life. That movie was based on the book by Robert Kanigel 
[52]. During his short life Ramanujan produced several large notebooks of mathematical 
formulas, without proofs - perhaps because he could not afford the extra paper - or maybe 
because it was enough that the formulas had been revealed to him by his goddess. Most of 
his life was spent in India but for five years starting in 1914 he was in England, working with 
the analytic number theorist G. H. Hardy. The sort of number theory done by Ramanujan was 
rather different from that of Hardy and the Ramanujan conjecture on the size of Fourier 
coefficients of modular forms seems to have been considered a “backwater” by Hardy, 
though (once generalized beyond all bounds) it has been a central topic in modern number 
theory. See Terras [118] for more information on this subject. 


8.4 Eigenvalues, Random Walks on Graphs, and Google 


Notation. All the vectors in this section will be column vectors in $\mathbb{C}^n$. Thus our matrices
$A \in \mathbb{C}^{n \times n}$ will act on the left, taking vectors $v \in \mathbb{C}^n$ to the vector $Av$, as in formula (1.3).
For a matrix $M \in \mathbb{C}^{n \times n}$, the transpose of M is denoted ${}^TM$.


We have considered the spectrum of a matrix in Section 4.1. First we review the definitions.
Given an n × n matrix A whose entries are complex numbers, we say that $\lambda \in \mathbb{C}$ is an
eigenvalue of A iff $\det(A - \lambda I) = 0$, where I is the n × n identity matrix. This is the same
thing as saying that the matrix $A - \lambda I$ is singular, or that $Ax = \lambda x$ for some nonzero column
vector $x \in \mathbb{C}^n$. Then we say x is an eigenvector of A corresponding to the eigenvalue $\lambda$. The
set of all the eigenvalues of the matrix A is called the spectrum of A. We denote it spec(A).
The name eigenvalue comes from D. Hilbert in 1904. Many other words have been used. 
P. Halmos (in [39, p. 102]) said: “Almost every combination of the adjectives proper, latent, 
characteristic, eigen, and secular, with the nouns root, number, and value has been used 
in the literature ...” Modern computer software such as Matlab, Mathematica, Scientific 
Workplace will find approximations to eigenvalues of large matrices. 
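
For a 2 × 2 matrix the definition can be carried out directly, since $\det(A - \lambda I) = \lambda^2 - (\operatorname{tr} A)\lambda + \det A$ is a quadratic. The following sketch (the helper name `eigenvalues_2x2` is made up for this illustration) solves that quadratic with the usual formula; it is not how the packages named above treat large matrices.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of det(A - t I) = t^2 - (a + d) t + (a d - b c)
    for the matrix A = [[a, b], [c, d]]."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)  # complex square root
    return (trace + root) / 2, (trace - root) / 2

print(eigenvalues_2x2(2, 1, 1, 2))    # a real symmetric matrix
print(eigenvalues_2x2(0, -1, 1, 0))   # rotation by 90 degrees
```

The second example shows why the definition is stated over the complex numbers: a real matrix need not have real eigenvalues.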


Exercise 8.4.1 


(a) Find the eigenvalues of the following matrices:


[Display of matrices; the entries are not legible in this copy.]


Do this exercise by hand - no computers allowed.
(b) Show that, for any square complex matrix A, spec(A) = spec(${}^TA$).


If the following exercises seem too terrible, you can find them in most linear algebra 
books. 


Exercise 8.4.2 


(a) Show that for any matrix $A \in \mathbb{C}^{n \times n}$ there is a unitary matrix U (meaning that ${}^T\overline{U}\,U = I$)
and an upper triangular matrix T, with $U, T \in \mathbb{C}^{n \times n}$, such that $A = {}^T\overline{U}\,T\,U$. This is called
the Schur decomposition of A. Since ${}^T\overline{U} = U^{-1}$, this says that the matrix A is similar
to T, that is, conjugate to T in the general linear group $GL(n, \mathbb{C})$.

(b) Then show that if $A = {}^T\overline{U}\,T\,U$ as in part (a), the diagonal entries of T are the eigenvalues
of A.


Hint on (a). We know that $\det(A - \lambda I) = 0$ has a root $\lambda_1$ by the fundamental theorem of
algebra. Therefore there is a corresponding eigenvector $v_1 \neq 0$ such that $Av_1 = \lambda_1 v_1$. Upon
multiplying $v_1$ by a scalar, we may assume that $\|v_1\| = 1$. Complete $v_1$ to an orthonormal
basis $\{v_1, v_2, \ldots, v_n\}$ of $\mathbb{C}^n$ using the Gram-Schmidt process that is to be found in Strang
[115], for example. Then $U_1 = (v_1\ v_2\ \cdots\ v_n)$ is a unitary matrix. And

$$U_1^{-1} A U_1 = \begin{pmatrix} \lambda_1 & * \\ 0 & A_2 \end{pmatrix},$$

where $A_2 \in \mathbb{C}^{(n-1) \times (n-1)}$. Use induction on n to complete the proof.


Exercise 8.4.3 


(a) Suppose that the matrix $A \in \mathbb{C}^{n \times n}$ is Hermitian, meaning that ${}^T\overline{A} = A$. Show that then the
upper triangular matrix T in the Schur decomposition of A from the preceding exercise
can be taken to be diagonal. This is the spectral theorem.

(b) Show that the eigenvalues of a Hermitian matrix are real numbers. 


Exercise 8.4.4 Suppose that $A \in \mathbb{C}^{n \times n}$ has n pairwise distinct eigenvalues $\lambda_1, \ldots, \lambda_n \in \mathbb{C}$.
Show that the corresponding eigenvectors are linearly independent over $\mathbb{C}$. This implies that
there is an invertible matrix $V \in \mathbb{C}^{n \times n}$ such that $A = V^{-1} D V$, where D is diagonal with ith
diagonal entry $\lambda_i$.


There are many applications of these concepts to engineering, physics, chemistry, statis- 
tics, economics, music - even the internet. Eigenvalues associated to structures can be used 
to analyze their stability under some kind of vibration such as that caused by an earthquake. 
The word “spectroscopy” means the use of spectral lines to analyze chemicals - as discussed 
earlier in Section 4.2. We will investigate one such application in this section - that of the 
Google search engine. References for this section include: Google’s website, G. Strang [115], 
C. D. Meyer [77], R. A. Horn and C. R. Johnson [45], Amy N. Langville and Carl D. Meyer 
[66], D. Cvetkovic, M. Doob, and H. Sachs [22], and A. Terras [116]. 

This section concerns real and complex linear algebra, the sort you learn as a beginning 
undergrad, for the most part, except for the Perron-Frobenius theorem. We will not be 
thinking about matrices with elements in finite fields in this section. Usually our matrices 
will have elements that are non-negative real numbers. That happens because our matrices 
will be Markov matrices from elementary probability theory. Markov invented this concept 
in 1907. Markov chains are random processes that retain no memory of their past states. 
An example is a random walk on the pentagon graph in Figure 8.14 below. References for 
the subject are J. C. Kemeny and J. L. Snell [53] and J. R. Norris [82].


Figure 8.14 A random walk on a pentagon. At time t = 0, the big penguin is at vertex 1.
At time t = 1 the penguin has probability 1/2 of being at vertex 2 and probability 1/2 of
being at vertex 5. So the penguins at these vertices are half size.


If we start our random walker at vertex 1, that corresponds to the probability
vector ${}^Tv_0 = (1, 0, 0, 0, 0)$. Then at time t = 1 the creature is either at vertex
2 or 5 with equal probability. That corresponds to the probability vector $v_1 = Mv_0$,
${}^Tv_1 = (0, 0.5, 0, 0, 0.5)$. Then at time t = 2 we have the probability vector
$v_2 = M^2v_0$, ${}^Tv_2 = (0.5000, 0, 0.2500, 0.2500, 0)$. At time t = 3 we have $v_3 = M^3v_0$,
${}^Tv_3 = (0, 0.3750, 0.1250, 0.1250, 0.3750)$. Continue in this manner up to time t = 10
and you find that $v_{10} = M^{10}v_0$, ${}^Tv_{10} = (0.2480, 0.1611, 0.2148, 0.2148, 0.1611)$. Already
we see that we are approaching the eigenvector ${}^Tu = (0.2, 0.2, 0.2, 0.2, 0.2)$, which is
the probability that the poor creature is totally lost, also known as the uniform
probability distribution. The speed of convergence to the uniform probability vector is
governed by the second largest eigenvalue in absolute value, which is 0.8090 in this case.
You need a time t such that $0.8090^t$ is negligible (depending on what metric you use on the
space of vectors in $\mathbb{R}^5$). Anyway, for our example, at time t = 30, the probability vector
is $v_{30} = M^{30}v_0$, ${}^Tv_{30} = (0.2007, 0.1994, 0.2002, 0.2002, 0.1994)$, which is close enough to
${}^Tu = (0.2, 0.2, 0.2, 0.2, 0.2)$ not to be able to notice the difference in a picture. Note that
$0.8090^{30} \approx 0.00173$. The actual Euclidean distance between the two vectors is

$$\|v_{30} - u\|_2 = \sqrt{(0.0007)^2 + 2(0.0006)^2 + 2(0.0002)^2} \approx 0.0011.$$


If our graph were the web, we would be saying all websites have the same rank since all 
the coefficients of the steady-state vector u are equal. 
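
The probability vectors quoted above for the pentagon can be reproduced in a few lines. This is a minimal sketch: it assumes the walker moves to each of its two neighbours with probability 1/2, and it numbers the vertices 0 to 4 rather than 1 to 5. The function names `step` and `walk` are made up for this illustration.

```python
def step(v):
    """One step of the random walk on the n-cycle: the walker moves to
    either neighbour with probability 1/2 (this is multiplication by M)."""
    n = len(v)
    return [0.5 * (v[(j - 1) % n] + v[(j + 1) % n]) for j in range(n)]

def walk(v0, t):
    """Return M^t v0 by applying `step` t times."""
    v = list(v0)
    for _ in range(t):
        v = step(v)
    return v

v0 = [1.0, 0.0, 0.0, 0.0, 0.0]   # the walker starts at the first vertex
for t in (1, 2, 3, 10, 30):
    print(t, [round(x, 4) for x in walk(v0, t)])
```

Replacing `v0` by a vector of length 4 gives a quick experiment for the square, where the printed vectors alternate instead of settling down.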


Exercise 8.4.7 


(a) Prove that 1 is an eigenvalue of any symmetric or non-symmetric Markov matrix M.
(b) Show that if $\lambda$ is any eigenvalue of a symmetric or non-symmetric Markov matrix M,
then $|\lambda| \le 1$.
(c) Suppose, in addition, that if $\lambda \neq 1$ is an eigenvalue of the symmetric Markov matrix M,
then $|\lambda| < 1$. Then there is an orthonormal basis $B = \{v_1, \ldots, v_n\}$ of $\mathbb{R}^n$ consisting of
eigenvectors of M such that $Mv_1 = v_1$. Prove that then

$$\lim_{k \to \infty} M^k v = u, \quad \text{where } u = \frac{1}{n}\,{}^T(1, \ldots, 1), \tag{8.18}$$

for any probability vector v. Note that the speed of convergence in this limit is measured
by the second largest eigenvalue of M in absolute value. Here we measure the distance
between two vectors v and u in $\mathbb{R}^n$ by the usual distance $\|v - u\|_2 = \sqrt{{}^T(v - u)(v - u)}$.


Hint. Using the orthonormal basis of eigenvectors $v_j$ of M, write $v = \sum_{j=1}^{n} a_j v_j$, for $a_j \in \mathbb{R}$.
Note that $a_j = \langle v, v_j \rangle$, the inner product of v and $v_j$. Apply $M^k$ to both sides and take the
limit as $k \to \infty$.


The following exercise gives examples of symmetric Markov matrices for which the
hypothesis for the eigenvalues $\lambda \neq 1$ in part (c) of the preceding exercise is false. The
conclusion is false as well. The Google matrix will have to be constructed to avoid such
behavior.


Exercise 8.4.8 


(a) What happens if you replace the pentagon in the preceding example with a square?

(b) More generally consider the random walk on the Cayley graph
$X(\mathbb{Z}_n, \{\pm 1 \pmod{n}\})$ in which the random walker at vertex x has equal probability
of moving to vertex $x + 1 \pmod{n}$ or to vertex $x - 1 \pmod{n}$. Show that if you want
the random walk to converge to the uniform probability vector $u = \frac{1}{n}\,{}^T(1, \ldots, 1)$ you
will need to take n odd or change the random walk to allow the walker to have three
choices at each step, one being to stay at vertex x.


Hint. Recall from Section 4.2 on the finite Fourier transform that we can use the additive
characters $\chi_a(x) = e^{2\pi i a x/n}$ to find the eigenvalues of the adjacency matrices of such
Cayley graphs to be $2\cos(2\pi a/n)$, for $a = 0, 1, \ldots, n - 1$. In the case that n is odd,
$\cos(2\pi a/n) \neq -1$ for every a.


Now we want to apply similar reasoning to a random walk on an extremely large directed 
graph. If you websurf to www.google.com and type in some words such as “eigenvalues,” 
you will get a long list of websites, ordered according to importance. How does Google 
produce the ordering? Google had to take over 8.1 billion webpages and rank them - as of 
2006, according to Amy N. Langville and Carl D. Meyer [66]. This was up from 2.7 billion 
in 2002. If you Google the question “how big is the Google matrix?” you get many answers. 
Most recently (September, 2016) I found that Google was indexing >30 trillion webpages 
and that the number increases by a factor of 30 every five years. Presumably this number 
does not include the “Deep Web.” Do not expect this text to discuss that subject. There is 
a website - www.internetlinestats.com - which claims to give the current numbers. It was 
around 1 billion websites when I looked in May of 2016. Note that a website may contain 
many webpages. We should also note that the algorithm we discuss mostly comes from the
ten-year-old book by Langville and Meyer [66] and thus may bear only slight resemblance
to the algorithm used by Google at the moment. Much is top secret of course.

Google seemingly does its page rank computation once a month. Google is a name close
to googol which means $10^{100}$. Google was invented by two computer science doctoral
students (Sergey Brin and Larry Page) at Stanford - in the mid-1990s. They only use ideas
from a standard undergraduate linear algebra course, plus a bit of elementary probability. 
They view the web as a directed graph with a web surfer randomly hopping around. The 
main idea is that the more links a website has to it, the more important it must be (these 
links are called “inlinks”). Figure 8.15 shows a tiny web with only five websites. The web- 
pages are the vertices of a directed graph. An arrow from vertex x to vertex y means that 
vertex x contains a link to vertex y. So in the example of Figure 8.15 you might think
vertex 5 is the most important, since it has the most arrows going to it. In short, if $x_k$ is the
number of links to site k, then $x = (1, 2, 1, 2, 3)$.

On the other hand, node 5 is what is called a “dead end.” It has no links to any other site. 
If we imagine a web surfer bouncing around from webpage to webpage, that surfer will 
land at node 5 and have nowhere to go. Many webpages are like this: for example, pdfs, 
gifs, jpgs. 

We want to create a Markov matrix to give the transition matrix for a random web surfer. 
Let us first ignore the problem of node 5 and just look at the matrix H whose i,j entry is

$$h_{ij} = \begin{cases} \dfrac{1}{\#(\text{arrows going out from site } j)} & \text{if there is an arrow from site } j \text{ to site } i, \\[1ex] 0 & \text{otherwise.} \end{cases} \tag{8.19}$$


Figure 8.15 Surfing a very small web


For the webgraph of Figure 8.15, we get


[Display (8.20): the 5 × 5 matrix H; the entries are not legible in this copy.]


This is almost a Markov matrix except that the entries of the last column do not sum to 1.
For the Google method to work, it would be nice to have an actual positive Markov matrix
M - meaning that every entry of M is positive. For we want it to satisfy the hypotheses of
Theorem 8.4.1 below which would imply that the largest eigenvalue is $\lambda_1 = 1$ and that $\lambda_1$
is an eigenvalue with a non-negative probability eigenvector corresponding to the steady
state (i.e., limiting) behavior of the associated Markov chain. This theorem also importantly
says that the other eigenvalues of M satisfy $|\lambda_j| < 1$.

You may want to use Matlab or Mathematica or Scientific Workplace (or whatever) to do 
the matrix computations in the following exercises. 


Exercise 8.4.9 


(a) Find the largest eigenvalue and a corresponding positive eigenvector for the matrix H
in (8.20) above. Since H is not a Markov matrix, do not expect $\lambda_1 = 1$.

(b) What is the interpretation of this as far as ranking the websites? Which site is most
important?

(c) Follow a web surfer who starts at site 1 through 20 iterations. That is, compute $H^k v$,
for $v = {}^T(1, 0, 0, 0, 0)$ and $k = 1, \ldots, 20$.

(d) What seems to be happening to the vector $H^k v$ in the limit as $k \to \infty$?

(e) Can you use the eigenvalues of H to explain what is happening in part (d)?


Exercise 8.4.10 To produce a Markov matrix, one Google idea is to replace the last column
in the matrix H of (8.20) above by a column with 1/5 in each row. Call the new matrix S.
Now it comes closer to satisfying the hypothesis of Theorem 8.4.1 below.


(a) Write down the matrix S. Does some $S^k$ have all positive entries?

(b) Compute a probability eigenvector of S corresponding to the eigenvalue 1. Which site
does this eigenvector say is the most important?

(c) Follow a web surfer who starts at site 1 through 20 iterations. What is the limit of $S^k v$,
for $v = {}^T(1, 0, 0, 0, 0)$, as $k \to \infty$? Compare your answers with those in part (b).


Google has one more trick. The matrix S obtained in Exercise 8.4.10 need not be such
that all its entries are positive, which is the hypothesis of Perron's theorem (Theorem 8.4.1)
below (although the weaker hypothesis that $S^k$ has all positive entries for some k will
also work - but is harder to check on a matrix which is 30 trillion × 30 trillion or even
8 billion × 8 billion). The new Google trick will also affect the second largest eigenvalue
in absolute value. Suppose we have n nodes in our internet. In the formulas that follow the
vectors are column vectors. Let b be the n-vector whose jth component ($j = 1, \ldots, n$) is

$$b_j = \begin{cases} 1 & \text{if site } j \text{ has no arrows going out (i.e., it is a dead end)}, \\ 0 & \text{otherwise.} \end{cases} \tag{8.21}$$

Define

$$S = H + \frac{1}{n}\, e\, {}^Tb, \tag{8.22}$$

where e is an n-vector of 1s and H is defined by (8.19). The Google matrix is given, for
$0 < \alpha < 1$, by setting

$$G = \alpha S + (1 - \alpha)\frac{1}{n} J, \tag{8.23}$$

where J is an n × n matrix all of whose entries are 1.
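
Formulas (8.19) and (8.21)-(8.23) translate directly into code. The sketch below is only illustrative: the function name `google_matrix`, the dictionary format for the links, the numbering of sites from 0, and the 4-site example web are all assumptions made here (the example is not the web of Figure 8.15).

```python
def google_matrix(links, n, alpha=0.85):
    """Build G = alpha*S + (1 - alpha)*(1/n)*J as a list of rows.
    links[j] lists the sites that site j links to; sites are 0..n-1."""
    # H from (8.19): column j spreads weight equally over the outlinks of j.
    H = [[0.0] * n for _ in range(n)]
    for j, targets in links.items():
        for i in targets:
            H[i][j] = 1.0 / len(targets)
    # b from (8.21): b[j] = 1 exactly when site j is a dead end.
    b = [0.0 if links.get(j) else 1.0 for j in range(n)]
    # S from (8.22): dead-end columns are replaced by columns of 1/n.
    S = [[H[i][j] + b[j] / n for j in range(n)] for i in range(n)]
    # G from (8.23).
    return [[alpha * S[i][j] + (1 - alpha) / n for j in range(n)]
            for i in range(n)]

# A hypothetical 4-site web in which site 3 is a dead end.
example_links = {0: [1, 2], 1: [2], 2: [0], 3: []}
G = google_matrix(example_links, 4)
for row in G:
    print([round(x, 4) for x in row])
```

A dense matrix like this is only practical for toy examples; the remainder of the section explains why Google never needs to form G explicitly.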


Exercise 8.4.11 Prove that the Google matrix G is a Markov matrix.


Exercise 8.4.12


(a) Write down G in formula (8.23) for the example of the small web in Figure 8.15 for
$\alpha = 0.9$ and then compute a probability eigenvector for G corresponding to the eigenvalue
1. Note that the entries of G are all positive. Which site does this eigenvector say is the
most important?

(b) Follow a web surfer who starts at site 1 through 20 iterations. What is the limit of
$G^k v$, for $v = {}^T(1, 0, 0, 0, 0)$, as $k \to \infty$? Compare your answers with those in Exercises 8.4.9 and
8.4.10.


Notes. In the formula for G, Google chooses $\alpha = 0.85$ - or it did 10 years ago or so. It could
be any number between 0 and 1. If $\alpha = 0.85$, it means that 85% of the time the web surfer
follows the hyperlink structure of the web and the other 15% of the time the web surfer
jumps (teleports) to a random webpage. Since 1/(8.1 billion) is small, the alteration in the
entries of the matrix H is not enormous. Of course the Google version of this matrix will be
8.1 billion × 8.1 billion - at least that was the approximate size in 2006. Now it may be 1000
times that large, or more. How does Google find the dominant eigenvector or page rank of
G? It uses a very old method called the power method which works well for sparse matrices
(meaning matrices most of whose entries are 0). That is, Google uses the fact that we noticed
in the preceding problems. If you just keep multiplying some fixed probability vector v by
G - in effect, computing $G^k v$ - this should converge to the probability eigenvector for the
eigenvalue 1.

In the case that the Markov matrix M is symmetric, the power method basically takes
advantage of equation (8.18) which says that for arbitrary probability vectors v, the vectors
$M^k v$ approach $u = \frac{1}{n}\,{}^T(1, \ldots, 1)$, the steady-state of the Markov chain, as $k \to \infty$. Why?
For a non-symmetric positive Markov matrix M, an analogous result comes from Theorem
8.4.1 below. But in the case of a non-symmetric Markov matrix the stationary state vector
will not have all entries equal and that will of course give the website rankings. The power
method was published by R. von Mises in 1929.


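Exercise 8.4.13 Suppose that the matrix $A \in \mathbb{R}^{n \times n}$ has n linearly independent eigenvectors
$v_j \in \mathbb{R}^n$ with $Av_j = \lambda_j v_j$. Suppose that $|\lambda_1| > |\lambda_2| \ge \cdots \ge |\lambda_n|$. Show that, for any vector
$w \in \mathbb{R}^n$,

$$\lim_{k \to \infty} \frac{1}{\lambda_1^k} A^k w = \beta v_1,$$

for some scalar $\beta \in \mathbb{R}$.

Hint. Write $w = \sum_{j=1}^{n} \gamma_j v_j$, for $\gamma_j \in \mathbb{R}$. Apply $A^k$ to both sides and take the limit.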


However, the power method is “notoriously slow” for non-sparse matrices like G. So why 
does it work for Google? The first part of the answer has to do with the fact that is proved 
in the next exercise. Google only needs to compute the iterates of sparse matrices and not 
G itself. 

The second part of the answer says that this method requires only about 50 iterations for
the huge matrix Google is dealing with. Why should this be? This has to do with the size
of the second largest eigenvalue $\lambda_2$ of G in absolute value. It turns out that for the Google
matrix, $|\lambda_2| \le \alpha$, with $\alpha$ from formula (8.23).

Choosing $\alpha = 0.85$, one finds that $0.85^{50} \approx 0.000296$. This means that Google can expect
about 2-3 places of accuracy in the page rank vector after about 50 iterations of replacing
v by Gv.
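
The arithmetic behind the figure of about 50 iterations is easy to reproduce. The sketch below assumes, as in the text, that the error after k iterations shrinks roughly like $\alpha^k$; the helper name `iterations_needed` and the tolerance 0.001 are chosen here only for illustration.

```python
from math import ceil, log

def iterations_needed(alpha, tol):
    """Smallest k with alpha**k <= tol, assuming the error decays like alpha**k."""
    return ceil(log(tol) / log(alpha))

print(0.85 ** 50)                       # size of the error factor after 50 steps
print(iterations_needed(0.85, 1e-3))    # steps needed for about three decimal places
print(iterations_needed(0.99, 1e-3))    # a value of alpha close to 1 is far slower
```

This also shows the trade-off in the choice of $\alpha$: a value closer to 1 follows the link structure more faithfully but needs many more iterations.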


Exercise 8.4.14 Here we are trying to understand part of the second-to-last paragraph.
Consider a web with n sites. Let H be the matrix whose i,j entry is defined by formula
(8.19). The Google matrix is as in formula (8.23) with b as in formula (8.21) and S as
in formula (8.22). Again suppose that e is an n-column vector of 1s. Show that if v is
any probability (column) vector, meaning its entries are $\ge 0$ and sum to 1 (which implies
${}^Te\,v = 1$), we have

$$Gv = \alpha H v + \frac{1}{n}\left(\alpha\, {}^Tb\,v + 1 - \alpha\right) e. \tag{8.24}$$

Note that H is sparse (with on average only about 10 nonzero elements in a column) and
the scalar ${}^Tb\,v$ is easy to compute. It follows that iterating $v \mapsto Gv$ will be quickly computed.
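
Formula (8.24) is what makes the power method practical: only the nonzero entries of H are ever touched. Here is a minimal sketch of the iteration $v \mapsto Gv$ written that way. The function name `pagerank`, the dictionary format for the links, the uniform starting vector, and the tiny example webs are assumptions made for this illustration, not a description of Google's actual code.

```python
def pagerank(links, n, alpha=0.85, iterations=50):
    """Iterate v -> Gv using (8.24), never forming the dense matrix G.
    links[j] lists the sites that site j links to; sites are 0..n-1."""
    out = {j: list(t) for j, t in links.items() if t}   # nonzero columns of H
    dead = [j for j in range(n) if j not in out]        # sites with b_j = 1
    v = [1.0 / n] * n                                   # uniform starting vector
    for _ in range(iterations):
        Hv = [0.0] * n
        for j, targets in out.items():                  # sparse product H v
            share = v[j] / len(targets)
            for i in targets:
                Hv[i] += share
        bv = sum(v[j] for j in dead)                    # the scalar Tb v
        const = (alpha * bv + 1 - alpha) / n
        v = [alpha * x + const for x in Hv]             # right side of (8.24)
    return v

# Two hypothetical webs: a dead end at site 1, and a 4-site web.
print(pagerank({0: [1], 1: []}, 2))
print(pagerank({0: [1, 2], 1: [2], 2: [0], 3: []}, 4))
```

Each pass costs time proportional to the number of links plus the number of sites, rather than $n^2$ as a dense matrix-vector product would.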


The next exercise is an attempt to explain “Google bombing.” To do this, people are 
paid to set up link farms to fool Google into thinking a webpage is more important than it 
otherwise would appear to be. Google attempts to find such occurrences and then give such 
pages lower ranks. It was sued for doing so in 2002. The lawsuit was dismissed in 2003. 
See the book of Langville and Meyer [66] for more information. Now Google claims to be 
using many (200) factors to rank sites - not just the page rank. 


Exercise 8.4.15 In the example of the small web in Figure 8.15, suppose the site-1 people 
are angry to be rated below site 5. To increase the rating of site 1, they create a new site 6 
with a link to site 1. Site 1 will also link to site 6. Does this help site 1’s ranking? 


(a) Find the new H matrix from formula (8.19). Then form the S matrix in formula (8.22).
Finally form the G matrix as in formula (8.23) with $\alpha = 0.85$.

(b) Find the probability eigenvector of the G matrix corresponding to the eigenvalue 1. 

(c) Would it help if site 1 created another new site with a link to site 1? Would it help more 
if we added a new site with 10 links to site 1? 

(d) What can Google do to minimize the effect of this sort of thing? 


The Perron theorem was proved by Perron in 1907 and later generalized by Frobenius in 
1912. The general version is called the Perron-Frobenius theorem. We give only a special 
case. To see the general version, look at Horn and Johnson [45]. 


Theorem 8.4.1 (Perron). Suppose that the n x n Markov matrix M has all positive entries. 
Then the following hold. 


(1) 1 is an eigenvalue of M and the corresponding vector space of eigenvectors is one- 
dimensional. 

(2) If the eigenvalues of M are listed as $\lambda_1 = 1, \lambda_2, \ldots, \lambda_n$, then $1 = |\lambda_1| > |\lambda_j|$, for all
$j = 2, \ldots, n$.

(3) There is an eigenvector $v_1$ corresponding to the eigenvalue 1 which is such that all
its entries are > 0 and they sum to 1.

(4) The vector $v_1$ is the steady state of the Markov chain with transition matrix M:
that is,

$$\lim_{k \to \infty} M^k x = v_1, \quad \text{for any probability vector } x.$$


It is easy to see that if M is a Markov matrix, 1 is an eigenvalue of the transpose, ${}^TM$,
with eigenvector ${}^T(1, 1, \ldots, 1)$. That is because the columns of M sum to 1. The eigenvalues
of ${}^TM$ are the same as the eigenvalues of M, since the determinant of a matrix is the same
as the determinant of its transpose. For the rest of the proof, see Meyer [77, Chapter 8],
or the last chapter of Horn and Johnson [45].


Exercise 8.4.16 


(a) Prove the Perron theorem in the case that the positive Markov matrix M is 2 × 2.

(b) Prove that if the spectrum of the Markov matrix S from Exercise 8.4.14 is
$\{1, \lambda_2, \ldots, \lambda_n\}$, then the spectrum of the Google matrix $G = \alpha S + (1 - \alpha)\frac{1}{n}J$ is
$\{1, \alpha\lambda_2, \ldots, \alpha\lambda_n\}$.


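Hint on (b). The proof can be found in Langville and Meyer [66, p. 46]. The trick is to make
clever use of block matrix multiplication. We know $e = {}^T(1, \ldots, 1)$ is an eigenvector of
$R = {}^TS$ corresponding to the eigenvalue 1. Replace the Google matrix by its transpose
${}^TG = \alpha R + (1 - \alpha)\, e\, {}^Tv$, where $v = \frac{1}{n}e$. Now spec(R) = spec(S) = $\{1, \lambda_2, \ldots, \lambda_n\}$. If $Q = (e\ X)$ is
a non-singular matrix with e as its first column and $Q^{-1} = \begin{pmatrix} {}^Ty \\ {}^TY \end{pmatrix}$, we have

$$Q^{-1} R\, Q = \begin{pmatrix} 1 & {}^Ty R X \\ 0 & {}^TY R X \end{pmatrix} \quad \text{and} \quad Q^{-1}\, {}^TG\, Q = \begin{pmatrix} 1 & * \\ 0 & \alpha\, {}^TY R X \end{pmatrix}.$$

It follows from this that spec(${}^TY R X$) = $\{\lambda_2, \ldots, \lambda_n\}$.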


Exercise 8.4.17 


(a) If a real matrix A has non-negative entries, write $A \ge 0$. If A has positive entries write
$A > 0$. Prove that if $A > 0$ and $x \ge 0$, $x \neq 0$, then $Ax > 0$.

(b) If $A - B \ge 0$, write $A \ge B$. Show that if $N \ge 0$ and $u \ge v \ge 0$, then $Nu \ge Nv \ge 0$.

(c) Is $\le$ an equivalence relation on n × n real matrices? Is $\le$ a partial order on n × n real
matrices?


You might still ask how Google finds the webpages with the words you typed. Google 
answers on its website that it has a large number of computers to “crawl” the web and 
“fetch” the pages and then form a humongous index of all the words it sees. So when we 
type in “eigenvalue” Google’s computers search their index for that word and the page rank 
of the websites containing that word among “200 factors.” That is what I found in 2011. 
Maybe things have changed a bit when you are reading this. 

We have one last question to ask concerning Google. Are they really living up to their
motto, "Don't be evil"? Is making money from ads that may contain lies possibly evil?


8.5 Elliptic Curve Cryptography


Before beginning our discussion of elliptic curves and the cryptography they enable, we list 
a few references: Ramanujachary Kumanduri and Cristina Romero [62, Chapter 19], Neal 
Koblitz [55], Joseph H. Silverman [105], [106], and Joseph H. Silverman and John Tate [107]. 
Google.com gave us many hits when we typed in “elliptic curve cryptography.” 

What is an elliptic curve? First it is not an ellipse. The connection with ellipses responsible 
for the name is in the computation of the arclength of an ellipse. 

Let K be any field, for example K=R, C, or Q or Z/pZ, p prime. We will mostly be 
interested in finite fields here. 


Definition 8.5.1 Assume $a, b, c \in K$. An elliptic curve E = E(K) is the set of points (x,y)
with x, y in K such that

$$y^2 = x^3 + ax^2 + bx + c.$$

We omit some technical conditions, which we will soon be forced to consider. You can
also replace the $y^2$ on the left with some other quadratic polynomial in y.

The real points on $E(\mathbb{R})$ are of interest. They will help us to visualize what we are doing
over finite fields. So we draw some pictures of elliptic curves over $\mathbb{R}$ in the plane. Figure 8.16
shows the real points (x, y) on the elliptic curve $y^2 = x^3 + x^2$. Figure 8.17 shows the real
points (x, y) on the elliptic curve $y^2 = x^3 - x + 1$. Figure 8.18 plots the real points (x, y) on
the elliptic curve $y^2 = x^3 - x$.


Figure 8.16 Real points (x, y) on the elliptic curve $y^2 = x^3 + x^2$


Exercise 8.5.1 Plot the real points (x, y) on the elliptic curve $y^2 = x^3 + x$ and any other
curves you find interesting.


Figure 8.17 Real points (x, y) on the elliptic curve $y^2 = x^3 - x + 1$


Figure 8.18 Real points (x, y) on the elliptic curve $y^2 = x^3 - x$


It is useful to replace the plane with the projective plane. In general if K is some field,
projective n-space is obtained by looking at points $x = (x_0, x_1, x_2, \ldots, x_n) \in K^{n+1}$ with $x \neq 0$,
and setting up an equivalence relation $x \sim t$ iff $x = \alpha t$, for some $\alpha \in K$. Here $\alpha t$ means the
usual multiplication of a vector t by a scalar $\alpha$.

Projective n-space over K is the set of equivalence classes of $K^{n+1} - 0$ under the
equivalence relation of the preceding paragraph: that is, $P_n(K) = (K^{n+1} - 0)/\sim$. This allows us
to replace our elliptic curve E(K) with a curve in the projective plane $P_2(K)$:

$$y^2 = x^3 + ax^2 + bx + c$$

becomes

$$(y/z)^2 = (x/z)^3 + a(x/z)^2 + b(x/z) + c \tag{8.25}$$

or

$$y^2 z = x^3 + a x^2 z + b x z^2 + c z^3.$$

We will identify (x, y, 1) in $P_2(K)$ with (x, y) in $K^2$. So we view $K^2$ as a subspace of the
projective plane called affine space. The line at infinity in $P_2(K)$ consists of the equivalence
classes of points (x, y, 0) in $P_2(K)$. The intersection of this line with the elliptic curve of
equation (8.25) has x = 0. Then the equivalence class in $P_2(K)$ containing the point (0, 1, 0)
is called the point at infinity. View it as a point on the intersection of the y-axis and the
line at infinity in $P_2(K)$. See Silverman and Tate [107].


Definition 8.5.2 Suppose a, b, c ∈ C. If


f(x) = x^3 + ax^2 + bx + c = (x − r_1)(x − r_2)(x − r_3),


then the discriminant of f is Δ_f = (r_1 − r_2)^2 (r_2 − r_3)^2 (r_1 − r_3)^2. One can show that the
discriminant is


Δ_f = a^2 b^2 + 18abc − 27c^2 − 4a^3 c − 4b^3.


This formula is proved for the case that f(x) = x^3 + bx + c in Birkhoff and Maclane [9,
pp. 112-113] using the formula for the roots of the cubic. Mathematica will compute such
things. An extra factor of 16 may appear in the discriminant of an elliptic curve. See, for
example, N. Koblitz [56, p. 26] or the extensive tables of J. E. Cremona, Algorithms for
Modular Elliptic Curves (www.warwick.ac.uk/staff/J.E.Cremona/book/fulltext/index.html) or the
website http://l-functions.org.
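If you would rather check the discriminant formula numerically than by hand, here is a minimal Python sketch (Python stands in for Mathematica here, and the function names are ours, not the text's). It compares the coefficient formula with the product over the roots for a cubic whose integer roots we choose ourselves.

```python
def disc_from_coeffs(a, b, c):
    """Discriminant of x^3 + a x^2 + b x + c from the coefficient formula."""
    return a*a*b*b + 18*a*b*c - 27*c*c - 4*a**3*c - 4*b**3


def disc_from_roots(r1, r2, r3):
    """Discriminant as the product of the squared differences of the roots."""
    return (r1 - r2)**2 * (r2 - r3)**2 * (r1 - r3)**2


def coeffs_from_roots(r1, r2, r3):
    """Expand (x - r1)(x - r2)(x - r3) = x^3 + a x^2 + b x + c."""
    a = -(r1 + r2 + r3)
    b = r1*r2 + r1*r3 + r2*r3
    c = -r1*r2*r3
    return a, b, c


if __name__ == "__main__":
    roots = (1, 2, 4)                       # an arbitrary test cubic
    a, b, c = coeffs_from_roots(*roots)
    print(disc_from_coeffs(a, b, c), disc_from_roots(*roots))
    # x^3 + x^2 (Figure 8.16) has the double root 0, so this prints 0.
    print(disc_from_coeffs(1, 0, 0))
```

The two numbers on the first printed line should agree for any choice of roots, which is a cheap sanity check on the signs in the formula.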

In order to create a group associated with our elliptic curve, we need to be able to draw
tangents to our elliptic curves. The tangent line to the curve y^2 = f(x) will be undefined at
points (r, 0) where both f(r) and f′(r) vanish. That is, the tangent is undefined when r is a double
root of f and thus the discriminant vanishes. An elliptic curve for which the discriminant
is nonzero is called non-singular. For example, the curve y^2 = x^3 + x^2 in Figure 8.16 does
not have a well-defined tangent at the origin and its discriminant is 0. We will want to
avoid curves with such points and, when we consider curves over finite fields, we will
avoid primes like 2 and 3 and those fields of characteristic p dividing the discriminant. We
really want to avoid double and triple roots over the finite field in order to be able to add
points on the curve. See also Silverman [105, p. 233] who explains in a footnote that there
are ways of turning bad primes into good primes. See Silverman and Tate [107].

Dummit and Foote [28, pp. 534-536] give a bunch of exercises involving the resultant of
two polynomials over any field explaining why the discriminant detects multiple roots.
Let us consider the special case of interest here. Suppose f(x) = x^3 + ax^2 + bx + c and
g(x) = rx^2 + sx + t, with a, b, c, r, s, t in a field F. We assume that r ≠ 0. Then the resultant
R(f, g) is


                | 1  a  b  c  0 |
                | 0  1  a  b  c |
R(f, g) = det   | r  s  t  0  0 |
                | 0  r  s  t  0 |
                | 0  0  r  s  t |


If f(x) = (x − α)(x − β)(x − γ) and g(x) = r(x − ρ)(x − σ) for roots α, β, γ, ρ, σ in some
extension field of F, then one can show that we have


R(f, g) = r^3 (α − ρ)(α − σ)(β − ρ)(β − σ)(γ − ρ)(γ − σ)
        = r^3 f(ρ) f(σ) = g(α) g(β) g(γ).


It follows that R(f, g) = 0 iff {α, β, γ} ∩ {ρ, σ} ≠ ∅. Thus R(f, f′) = 0 iff f has a root of
multiplicity greater than 1. In fact, the discriminant of the cubic polynomial f(x) is −R(f, f′).
You can easily compute these things with Mathematica or Scientific Workplace.
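For readers without Mathematica, the following Python sketch builds the determinant above and checks that −R(f, f′) matches the discriminant for a cubic with known roots. It is only an illustration under our own conventions: polynomials are lists of integer coefficients, highest power first, and the helper names are hypothetical. The determinant is computed by expansion by minors, which is fine for a 5 x 5 matrix but not for large ones.

```python
def sylvester(f, g):
    """Sylvester matrix of f and g (coefficient lists, highest power first)."""
    m, n = len(f) - 1, len(g) - 1           # degrees of f and g
    size = m + n
    rows = [[0]*i + f + [0]*(size - m - 1 - i) for i in range(n)]
    rows += [[0]*i + g + [0]*(size - n - 1 - i) for i in range(m)]
    return rows


def det(matrix):
    """Determinant by expansion along the first row (exact for integers)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j+1:] for row in matrix[1:]]
        total += (-1)**j * entry * det(minor)
    return total


def derivative(f):
    """Formal derivative of a coefficient list, highest power first."""
    deg = len(f) - 1
    return [coef * (deg - i) for i, coef in enumerate(f[:-1])]


def resultant(f, g):
    return det(sylvester(f, g))


if __name__ == "__main__":
    f = [1, -7, 14, -8]                     # (x - 1)(x - 2)(x - 4)
    print(-resultant(f, derivative(f)))     # compare with (1-2)^2 (2-4)^2 (1-4)^2
    print(resultant([1, 1, 0, 0], derivative([1, 1, 0, 0])))   # x^3 + x^2
```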


(8.26) 


Exercise 8.5.2


(a) Evaluate the resultant R(f, f′) for the polynomial f(x) = x^2 + ax + b. It will be a 3 x 3
determinant.

(b) Explain how to get the analog of formula (8.26) for R(f, g) if g(x) = rx + s, with a, b, r, s
in some field F.


Exercise 8.5.3 Evaluate the resultant R(f, f′) for the polynomial f(x) = x^3 + ax^2 + bx + c.
It will be a 5 x 5 determinant.


What is the group of an elliptic curve? To associate an Abelian group G to an elliptic
curve E(K) where K is any field - for us K = Q or F_p - the simplest way is to say that
three points p, q, r on E(K) add to 0 iff they lie on a straight line. See Figure 8.19. We
define the identity 0 to be the point at infinity on the curve. Then if p = (x, y), we see
that −p = (x, −y). Think of 0 as a point infinitely far up any vertical line. If you need to
compute 2p = p + p, then define the intersection of the curve and its tangent at p to be
−2p. Of course, this makes sense over R. To figure out what is happening over a finite
field, we just use the formulas derived from those over the real field. We will make this
more precise in the examples below. The curve must have a well-defined tangent at every
point for our construction to work. Thus the curve in Figure 8.16 is a bad one, since there
is no well-defined tangent at the origin.


Theorem 8.5.1 The preceding definition makes the non-singular elliptic curve into an 
Abelian group. 


Proof sketches are given in Kumanduri and Romero [62, p. 496] and Silverman and Tate
[107]. The big problem is the proof of the associative law. The simplest proof of this law
uses results from algebraic geometry on numbers of points on intersections of curves. One
could also make use of equation (8.31) for adding two distinct points A, B and then throw
in a third distinct point C. You might want to use the symbolic computational power of
Mathematica to see the symmetry of the final result in A, B, C. Then there would be other
cases to consider: for example, A = B.


Figure 8.19 Addition A + B = C on the elliptic curve y^2 = x^3 − x over R


Example. Look at y^2 + y = x^3 − x^2 over the field Q. This curve has five rational points:
a = (0, 0), b = (1, −1), c = (1, 0), d = (0, −1) and the point at infinity denoted 0.
Can you prove it? The finite real points are shown in Figure 8.20.

The group G of this curve over Q turns out to be the cyclic group of order 5 generated
by a. First, you can see that 2a = b. In this case, you just need to see that the tangent to
the curve at a (which is the x-axis) intersects the curve at c. So that means a + a + c = 0
and thus a + a = −c = b. Next note that b + c = 0 = the point at ∞, since the line through
b and c is vertical and thus goes through the point at ∞.


Exercise 8.5.4 Compute the addition table for the group G = {0, a, b, c, d} of the curve
y^2 + y = x^3 − x^2. The group G is a cyclic group generated by a. Assume that we have found
all the rational points on the curve.


There is a proof of the following theorem in Silverman and Tate [107]. 


Figure 8.20 The rational points on the curve y^2 + y = x^3 − x^2 are a, b, c, d and the point at ∞


Theorem 8.5.2 (Mordell). The group G associated to an elliptic curve E(Q) is a finitely
generated (not necessarily finite) Abelian group.


Thus, by the fundamental theorem of finitely generated Abelian groups, the group G of
E(Q) is isomorphic to a direct sum Z_{a_1} ⊕ ⋯ ⊕ Z_{a_n} ⊕ Z^r. Here r is called the rank of G. The
finite part of G is called the torsion subgroup. Kumanduri and Romero [62, Section 19.5]
find the torsion subgroup. The rank is harder.

The rank of an elliptic curve over Q is connected to the congruent number problem
which is still open. The congruent number problem asks which positive integers n are such
that there is a right triangle with rational sides whose area equals n. More precisely, for a
given n the following two questions are equivalent:


Question A. Given n ∈ Z^+, does there exist a right triangle with rational sides whose
area equals n?


Question B. Is the rank of y^2 = x^3 − n^2 x over Q positive?


Remarks on Elliptic Curves over the Field C 


Historically it has been important to study elliptic curves over the complex numbers. We
will not really want to do this here as it involves complex analysis. However, we cannot
resist giving a sketch of the basics. For the theory of elliptic curves over C, one needs
the Weierstrass ℘-function. The function ℘(z) is a holomorphic function of z in the
complex plane except for a double pole at each point of a lattice L = ω_1 Z + ω_2 Z in the
plane. Moreover one has the differential equation:


℘′(z)^2 = 4℘(z)^3 − g_2 ℘(z) − g_3.


This implies that the point (℘(z), ℘′(z)) lies on an elliptic curve E(C). The “curve” is a
subset of C^2 ≅ R^4 which has one complex parameter and thus two real parameters. The
graph would have to be drawn in four real dimensions. If you need to know, the numbers
g_2, g_3 are given by Eisenstein series:


g_2 = 60 Σ 1/(mω_1 + nω_2)^4 and g_3 = 140 Σ 1/(mω_1 + nω_2)^6,


where both sums run over all pairs of integers (m, n) ≠ (0, 0).

If you set one of the periods equal to 1, you get a function of the other period in the
upper half plane, which is a modular form. Mathematica knows ℘(z) as
WeierstrassP[z,{g2,g3}]. This gives a one-to-one correspondence from z to (℘(z), ℘′(z))
which takes the torus C/L (where we identify points in C which differ by a lattice point in
L = ω_1 Z + ω_2 Z) to the elliptic curve E(C). The mapping is an isomorphism of Abelian
groups. The Weierstrass ℘-function is not usually covered in undergraduate analysis.
However, you can find a discussion in Koblitz [56] and many complex analysis books.


Elliptic Curves over the Field F_q


For our application to cryptography, we need to discuss elliptic curves over finite fields F_q.
Mostly we consider the special case that q = p, where p is prime. How do we add points on
a curve E = E(F_p) for a, b, c ∈ Z,


y^2 = x^3 + ax^2 + bx + c (mod p)? (8.27)


As we said earlier, we must avoid the prime 2 as well as the “bad primes” dividing the
discriminant of the polynomial, which is:


a^2 b^2 + 18abc − 27c^2 − 4a^3 c − 4b^3.


We will also assume that the prime p is larger than 3. Now let us imitate the construction
over R. Over F_p, the “curve” is just a finite set of points.

See Figure 8.21 for an example mod 59. The red points on the 59 x 59 grid correspond to
(x, y) such that y^2 = x^3 − x + 1 (mod 59) and they do bear some resemblance to the graph
of the same elliptic curve over the reals in Figure 8.17. Perhaps they would bear more
resemblance if I had graphed y from −29 to 29 rather than from 1 to 59. Of course, the
figure is periodic mod 59 in both x and y. You could fill up the plane with copies, or - better
yet - put the whole thing on a torus.

In general, consider points on an elliptic curve E(F_p) such as the one in Figure 8.21. We
want to write down equations for adding points. Let P = (x_1, y_1) and Q = (x_2, y_2), P + Q =
(x_3, −y_3), with x_1 ≠ x_2. Let y − y_1 = μ(x − x_1) be the line L through P and Q. Points (x, y) on the
“line” L must satisfy


y = μx + β (mod p), where μ = (y_2 − y_1)/(x_2 − x_1) (mod p) and β = y_1 − μx_1 (mod p). (8.28)


In equation (8.28), you have to find the inverse of x_2 − x_1 (mod p) to find the “slope” μ of
the line L.

Figure 8.21 The pink squares indicate the points (x, y) on the elliptic curve y^2 = x^3 − x + 1 mod 59.
Points marked are: A = (15, 36), B = (22, 40), C = (32, 46)


To find the third point on L and E, which is −(P + Q), we plug equations (8.28) into
equation (8.27) for the elliptic curve E(F_p). You get a cubic equation for x:


(μx + β)^2 = x^3 + ax^2 + bx + c (mod p).

So we can find our point −(P + Q) by solving the cubic:

f(x) = x^3 + (a − μ^2)x^2 + (b − 2μβ)x + c − β^2 = 0 (mod p). (8.29)

We already have two roots of f(x), namely x_1 and x_2. This means that


f(x) = (x − x_1)(x − x_2)(x − x_3)
     = x^3 − (x_1 + x_2 + x_3)x^2 + (x_1 x_2 + x_1 x_3 + x_2 x_3)x − x_1 x_2 x_3 (mod p). (8.30)

Therefore if x_1 ≠ x_2, P = (x_1, y_1) and Q = (x_2, y_2), we have P + Q = (x_3, −y_3), with

x_3 = μ^2 − a − x_1 − x_2 (mod p), (8.31)
y_3 = μ(x_3 − x_1) + y_1 (mod p), where μ = (y_2 − y_1)/(x_2 − x_1) (mod p).


Again in the formula for μ, you must divide mod p. When x_1 = x_2 but P ≠ Q, the sum is 0,
the point at infinity.

To figure out the rule when P = Q, look at the tangent to E at P. This is found by recalling
that the “derivative” is the “slope” of the “tangent” and thus formally we have


2y (dy/dx) = 3x^2 + 2ax + b (mod p).

It follows that the slope μ of the “tangent” to E at the point (x_1, y_1) is


μ = (3x_1^2 + 2ax_1 + b)/(2y_1) (mod p).

If y_1 = 0, the “tangent” is vertical and the third point on the curve is 0, the point at infinity.


Once more, to find the sum P + P, substitute the equation y = μx + β into equation (8.27)
for the elliptic curve. This time we have a double root at x_1. So instead of (8.30) we see that


f(x) = (x − x_1)^2 (x − x_3) = x^3 − (2x_1 + x_3)x^2 + (x_1^2 + 2x_1 x_3)x − x_1^2 x_3 (mod p). (8.32)

This implies 2P = P + P = (x_3, −y_3) where


x_3 = μ^2 − a − 2x_1 (mod p),
y_3 = μ(x_3 − x_1) + y_1 (mod p), where μ = (3x_1^2 + 2ax_1 + b)/(2y_1) (mod p). (8.33)

Note that equation (8.33) is just (8.31) with x_1 = x_2, except for finding μ with formal
derivatives.


As with elliptic curves over Q, we should really give a proof that equations (8.31) and
(8.33) define an Abelian group operation. One could use Mathematica to help check the
associative law or just be content that, for our examples, we really have groups.


Example. The group of the elliptic curve E given by y^2 = x^3 + 1 (mod 5).
It is now easy to find the points on the curve. Substitute x = 0, 1, 2, 3, 4 and solve for
y (mod 5). You find


A = (0, 1), B = (0, −1), C = (2, 3), D = (2, −3),
F = (4, 0) and 0 is the point at ∞.

We can use equations (8.31) and (8.33) to compute the group table for the group G of
points on E. First note that A = (0, 1) and B = (0, −1) imply A + B = 0, the point at infinity.


To find A + C note that the slope of the line L through A and C is μ = (3 − 1)/(2 − 0) (mod 5).
So μ = 1 (mod 5). Then, using (8.31), we see that


x_3 = μ^2 − a − x_1 − x_2 = 1 − 0 − 0 − 2 = −1 = 4 (mod 5),
y_3 = μ(x_3 − x_1) + y_1 = 1(4 − 0) + 1 = 5 = 0 (mod 5).


Thus A + C = (4, 0) = F.


To find C + C, use equation (8.33) and note that in this case μ = (3 * 2^2)/(2 * 3) = 3 * 4 = 2 (mod 5),
since 2 * 3 = 1 (mod 5).


x_3 = μ^2 − a − 2x_1 = 4 − 0 − 4 = 0 (mod 5),
y_3 = μ(x_3 − x_1) + y_1 = 2(0 − 2) + 3 = −1 (mod 5).


It follows that 2C = (0, 1) = A.
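The two sums just computed by hand can be checked with a few lines of Python. This is a minimal sketch rather than a definitive implementation: the function name ec_add is ours, the point at infinity is represented by None, coordinates are assumed to be already reduced mod p, and the modular inverse uses pow(n, -1, p), which needs Python 3.8 or later.

```python
def ec_add(P, Q, a, b, p):
    """Add two points on y^2 = x^3 + a x^2 + b x + c (mod p).

    None stands for the point at infinity. The constant term c does not
    appear in the addition formulas, so it is not a parameter.
    """
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # vertical line: Q = -P
    if P == Q:
        mu = (3*x1*x1 + 2*a*x1 + b) * pow(2*y1 % p, -1, p) % p   # tangent slope
    else:
        mu = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p           # chord slope
    x3 = (mu*mu - a - x1 - x2) % p
    y3 = (mu*(x3 - x1) + y1) % p                      # third point on the line
    return (x3, -y3 % p)                              # reflect to get the sum


if __name__ == "__main__":
    A, B, C = (0, 1), (0, 4), (2, 3)                  # B = (0, -1) mod 5
    print(ec_add(A, C, 0, 0, 5))                      # A + C
    print(ec_add(C, C, 0, 0, 5))                      # 2C
    print(ec_add(A, B, 0, 0, 5))                      # A + B
```

Writing B as (0, 4) rather than (0, −1) keeps every coordinate in the range 0 to p − 1, so that the test P == Q compares like with like.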


Exercise 8.5.5 Compute the rest of the group table for the group G of the preceding example. 
Is G cyclic? 


Exercise 8.5.6 Compute the group table for y^2 = x^3 − x + 1 (mod 7). You can draw a graph
of the curve and count the points on it (not forgetting to add in the point at infinity) or use
the formula in Exercise 8.5.10 below telling you how many points are on the curve.

Example. Consider the elliptic curve y^2 = x^3 − x + 1 (mod 59) in Figure 8.21. This prime
was chosen because x^3 − x + 1 = (4 + x)(13 + x)(42 + x) (mod 59). The prime 59 is the first
prime for which the polynomial factors completely. We plot the points as red squares in a
grid of (x, y) ∈ F_59^2. For any prime, the point (1, 1) - at the bottom left in the figure - is on
the elliptic curve. Some points can be added visually. For example, the point (17, 59) - at
the top of the figure - is of order 2 because the y-coordinate is 0 (mod 59). There are two
other points of order 2 because they have the same y-coordinate. By the main theorem on
cyclic groups and their subgroups in Section 2.5, this implies that the group of this curve
is not cyclic, since a cyclic group has at most one element of order 2.

One can use formula (8.35) in Exercise 8.5.10 below to see that the curve y^2 = x^3 − x + 1
(mod 59) has 60 points. I used Mathematica which can compute the Legendre symbol
defined in the same exercise. Or you can count the points in Figure 8.21. There are 59
points visible in Figure 8.21 plus the point at infinity, which can be thought of as infinitely
far up the y-axis.

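For some of the points in Figure 8.21, it is tempting to draw a line to do the addition.
For example the points A = (15, 36), B = (22, 40), C = (32, 46) look collinear in the figure,
but the slopes do not agree: (40 − 36)/(22 − 15) = 4/7 = 9 (mod 59), while
(46 − 36)/(32 − 15) = 10/17 = 11 (mod 59). Formula (8.31) with μ = 9 gives
x_3 = 81 − 15 − 22 = 44 and y_3 = 9(44 − 15) + 36 = 2 (mod 59), so that
A + B = (44, −2) = (44, 57). The “lines” are rarely easy to see, thanks to the
periodicity mod 59 in each coordinate.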
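If you do not have Figure 8.21 in front of you, a brute-force search recovers the same picture. The short Python sketch below (our own illustration, not the Mathematica used for the figure) lists every affine point of y^2 = x^3 − x + 1 (mod 59), counts them together with the point at infinity, and picks out the points with y = 0, which are the points of order 2. Note that the code uses y = 0 where the figure's labeling uses y = 59.

```python
p = 59


def f(x):
    """Right-hand side of the curve equation, reduced mod p."""
    return (x**3 - x + 1) % p


# All affine points (x, y) with 0 <= x, y < p on the curve.
points = [(x, y) for x in range(p) for y in range(p) if (y*y - f(x)) % p == 0]

# Points with y = 0 satisfy P = -P, so they have order 2.
order_two = [P for P in points if P[1] == 0]

if __name__ == "__main__":
    print(len(points) + 1)        # affine points plus the point at infinity
    print(order_two)
```

The x-coordinates in order_two are the roots of x^3 − x + 1 mod 59, so they should match the factorization quoted in the example.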


Exercise 8.5.7 Compute the discriminant of the elliptic curve in Figure 8.21. Primes p 
dividing this discriminant must be avoided when considering the curve mod p. 


Exercise 8.5.8 Use Mathematica or the software of your choice to add points on the curve
y^2 = x^3 − x + 1 (mod 59)
in Figure 8.21. Show that the point (1, 1) has order 30. With a little more effort you can
create the group of this curve - a non-cyclic Abelian group of order 60. This can be done by
first creating the cyclic subgroup H_2 generated by an element left out of H_1 = ⟨(1, 1)⟩, and
then continuing with the cyclic subgroup H_3 generated by an element left out of the earlier
subgroups, and so on. The cycle diagrams generated in this way produce interesting figures.
Recall that we considered such diagrams for the multiplicative groups Z_n^* in Section 2.5.
As a culmination of this exercise, try to draw the cycle diagram. Which of the following
groups is (isomorphic
to) the group of this curve: Z3) @ Z,, Z,g BZ 19, Zy5 BZ, PB Zo, or Zi, GB Zy, or something 
else? Are these groups all non-isomorphic anyway? 


Exercise 8.5.9 Show that the elliptic curve y^2 = x^3 + ax^2 + bx + c (mod p) has at most
2p + 1 points.


Remarks on the Number of Points on an Elliptic Curve Mod p 


Define the Legendre symbol (n/p) for an odd prime p by:


(n/p) =   1 if p does not divide n and n = x^2 (mod p) has a solution x,
          0 if p divides n,
         −1 otherwise.                                                    (8.34)


This symbol is beloved by number theorists because of its appearance in the quadratic
reciprocity law relating (p/q) and (q/p) for two distinct odd primes p and q. See Rosen [91] or
Terras [116, Chapter 8]. Gauss was so fond of this subject that he found eight proofs of the
quadratic reciprocity law.


Exercise 8.5.10 (Formula for the Number of Points on an Elliptic Curve Modulo a Prime).


(a) Let f(x) = x^3 + ax^2 + bx + c. Show that

1 + (f(x)/p) = the number of solutions y to the congruence y^2 = f(x) (mod p).


(b) Then prove that the number of points on the elliptic curve y^2 = x^3 + ax^2 + bx + c
(mod p) is


N_p = p + 1 + Σ_{x=0}^{p−1} (f(x)/p). (8.35)


(c) Use the formula from part (b) in order to see that there are six points on the curve
y^2 = x^3 + 1 (mod 5).


Theorem 8.5.3 (H. Hasse, 1933). If N_p is the number of points on an elliptic curve mod p,
set a_p = p + 1 − N_p. Then |a_p| ≤ 2√p.


To prove this theorem, using Exercise 8.5.10, one must bound the sum Σ_x (f(x)/p).
One expects the Legendre symbols to be randomly +1 or −1. That leads to the heuristic
reason for the bound. If you want a real proof, see Lang [64] or Silverman [106].
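Formula (8.35) and the Hasse bound are easy to experiment with. The Python sketch below is one way to do it, under our own naming: it computes the Legendre symbol by Euler's criterion (n^((p−1)/2) mod p), counts points with (8.35), and compares with a direct count of solutions. It is meant for small primes only, since both counts loop over all x.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)       # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


def count_points(a, b, c, p):
    """N_p = p + 1 + sum over x of (f(x)/p), as in formula (8.35)."""
    return p + 1 + sum(legendre(x**3 + a*x*x + b*x + c, p) for x in range(p))


def count_points_brute(a, b, c, p):
    """Count the pairs (x, y) on the curve directly, plus the point at infinity."""
    return 1 + sum((y*y - (x**3 + a*x*x + b*x + c)) % p == 0
                   for x in range(p) for y in range(p))


if __name__ == "__main__":
    print(count_points(0, 0, 1, 5))       # the curve y^2 = x^3 + 1 (mod 5)
    for p in (7, 11, 13):
        n = count_points(0, -1, 1, p)     # y^2 = x^3 - x + 1 (mod p)
        print(p, n, (p + 1 - n)**2 <= 4*p)
```

The last column tests |a_p| ≤ 2√p in the squared form (p + 1 − N_p)^2 ≤ 4p, which avoids floating-point square roots.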

Much is known about elliptic curves. For example, it has also been proved that the group 
G of points on an elliptic curve (mod p) is a product of at most two cyclic groups. More 
references on the subject are: Jeff Hoffstein, Jill Pipher, and Joseph H. Silverman [44], 
Kristin Lauter [67], Karl Rubin and Alice Silverberg [96], Alice Silverberg [104], and Joseph 
H. Silverman [106]. 


Exercise 8.5.11 Find the number of points on the curve y^2 = x^3 − x + 1 (mod p) for all
primes p such that 5 < p < 30 and such that p does not divide the discriminant of x^3 − x + 1.


Hint. Mathematica knows how to compute the Legendre symbol (n/p) via the command
JacobiSymbol[n,p].


Exercise 8.5.12 State whether the groups of the curves y^2 = x^3 − x + 1 (mod p) considered
in the preceding exercise are cyclic.


Exercise 8.5.13 State whether the group of the curve y^2 = x^3 − x + 1 (mod 163) is cyclic.


Hint. Mathematica can automate the addition of points. Start with a={1,1}; pr=163.
Compute b = 2a as follows.


mu=Mod[(3*PowerMod[a[[1]],2,pr]-1)*PowerMod[2*a[[2]],-1,pr],pr];
q1=Mod[PowerMod[mu,2,pr]-2*a[[1]],pr];q2=Mod[-mu*(q1-a[[1]])-a[[2]],pr];
b={q1,q2}

From then on, one can keep adding a to b. For example if we want b = 12a, we use the
command below to add a to b ten times.


Do[(mu=Mod[(b[[2]]-a[[2]])*PowerMod[b[[1]]-a[[1]],-1,pr],pr];
r1=Mod[PowerMod[mu,2,pr]-a[[1]]-b[[1]],pr];
r2=Mod[-mu*(r1-a[[1]])-a[[2]],pr];b={r1,r2}),{10}]


Elliptic Curves over Finite Fields and Cryptography 


In Section 4.1 we saw how to get public-key secret codes from the multiplicative group
Z_{pq}^*, when p and q are large primes. The security of such codes derives from the difficulty
of factoring pq when p and q are two large primes. Elliptic curve cryptography makes use
of the group G of an elliptic curve mod p. It seems that one can use smaller public keys
and still have secure messages using elliptic curve cryptography.

Note that our analog of raising an element a ∈ Z_{pq}^* to the kth power in G is multiplying
an element a in G by k, or adding a to itself k times to get


ka = a + a + ⋯ + a (k times).


To do this fast, one can proceed in an analogous way to that with powers. For example,

100 = 2^6 + 2^5 + 2^2.


The result is that, in order to compute 100a, we need six doublings and two additions.
We will want to encode our plaintext m as a point P_m on an elliptic curve E so that it
will be easy to get m from P_m.
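The count of six doublings and two additions for 100a can be seen by running the doubling method on a small example. The following Python sketch is an illustration under our own assumptions: it reads the binary digits of k from the left, uses the curve y^2 = x^3 + 1 (mod 5) from the earlier example as a toy group, represents the point at infinity by None, and needs Python 3.8 or later for pow(n, -1, p). The function names are ours.

```python
def ec_add(P, Q, a, b, p):
    """Add points on y^2 = x^3 + a x^2 + b x + c (mod p); None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        mu = (3*x1*x1 + 2*a*x1 + b) * pow(2*y1 % p, -1, p) % p
    else:
        mu = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (mu*mu - a - x1 - x2) % p
    return (x3, -(mu*(x3 - x1) + y1) % p)


def scalar_mul(k, P, a, b, p):
    """Return (kP, number of doublings, number of additions) for an integer k >= 1."""
    result, doublings, additions = P, 0, 0
    for bit in bin(k)[3:]:                 # binary digits of k after the leading 1
        result = ec_add(result, result, a, b, p)
        doublings += 1
        if bit == "1":
            result = ec_add(result, P, a, b, p)
            additions += 1
    return result, doublings, additions


if __name__ == "__main__":
    C = (2, 3)                             # a point on y^2 = x^3 + 1 (mod 5)
    print(scalar_mul(100, C, 0, 0, 5))
```

Since 100 is 1100100 in binary, the loop sees six digits after the leading 1, two of which are 1, which is where the six doublings and two additions come from.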


Remarks 


(1) There does not exist an algorithm for writing down lots of points on E(F_p) in log p
time.
(2) It is not sufficient to generate random points on E(F_p) anyway.


A Probabilistic Method to Encode Plaintext m as P_m on an elliptic curve E(F_p). We will
illustrate the method with an example from Koblitz [55, p. 168]. The curve is


y^2 + y = x^3 − x (mod 751).


This curve has 727 points, and 727 is a prime. Thus the group of the curve is cyclic.


Exercise 8.5.14 Check the last statement. You will want to make use of formula (8.36) below
completing the square on the left-hand side of the equation.


Take a number κ = 20 (or larger). The number κ is chosen so that a failure rate of 1/2^κ
is OK when seeking our point. We will need to represent numbers m between 0 and 35
(meaning the usual alphabet plus the digits from 0 to 9):


0, 1, 2, ..., 9, A, B, C, D, ..., Y, Z.


So we want p > 35 x κ = 700. Our p = 751 so that is OK.

Write x between 0 and 700 in the form x = m * 20 + j, where 1 ≤ j ≤ 20. Then compute
y′ = y + 376 so that


2 * 376 = 1 (mod 751) and 376^2 = 188 (mod 751),
y′^2 = (y + 376)^2 = y^2 + y + 188 = x^3 − x + 188 (mod 751). (8.36)


Thus we need a fast way to do square roots mod p. Luckily Mathematica does square roots
mod p. So we do not need to program this algorithm (unless we hate Mathematica). Of
course programs like SAGE actually know about elliptic curves.

If we can solve for y then set P_m = (x, y). Otherwise replace j by j + 1 in the formula for
x and try again. Since our curve has 727 points, probability says we should not have to
increment more than 20 times. Set f(x) = x^3 − x + 188 (mod 751). There is approximately
a (1/2)^20 chance that f(m * 20 + j) will not be a square for any j = 1, 2, ..., 20; assuming
that the events f(m * 20 + j) = square and f(m * 20 + j + 1) = square are independent - an
unproved but reasonable assumption.

Here is our alphabet table.


 0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F   G   H
 0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17

 I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z
18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35


We find some points on the curve corresponding to our alphabet entries.


(1) First m = 0 gives P_0. We look at x = 0 * 20 + 1 and plug that into
    y′^2 = (y + 376)^2 = x^3 − x + 188 = 1 − 1 + 188 = 188 = (376)^2 (mod 751).
    Clearly the solution is y = 0. So P_0 = (1, 0).

(2) Next we look at m = 1, and form x = 1 * 20 + j, with j = 1, ..., 20. For j = 1, we have
    x = 21 and solve
    y′^2 = (y + 376)^2 = x^3 − x + 188 = 21^3 − 21 + 188 = 416 = (618)^2 (mod 751).
    Then y′^2 = (618)^2 (mod 751) has two solutions. We take y′ = 618 (mod 751). Then
    y = y′ − 376 = 242 (mod 751). The other solution is y = 508 (mod 751). We will ignore
    it. So we get the point P_1 = (21, 242). We could have equally well said (21, 508).

(3) Similarly we find P_2 = (41, 101).

(4) The next case is more interesting. If we set m = 3 and form x = 3 * 20 + j, with j = 1,
    we see that when x = 61, we cannot solve the congruence
    y′^2 = (y + 376)^2 = x^3 − x + 188 = 61^3 − 61 + 188 = 306 (mod 751).
    So we must increment j to 2 and look at x = 62. Luckily this guy is a square mod 751
    and we find that P_3 = (62, 214) or (62, 536).

(5) Corresponding to the letter S is the number m = 28 (from our alphabet table). Then we
    find the point P_28 = (562, 174) or (562, 576) on the curve E. It again takes two tries.

Of course we are using Mathematica to do this. Here is part of our Mathematica notebook.
In versions of Mathematica from 2001, we needed to include the package


<< NumberTheory`NumberTheoryFunctions`


in order to take square roots mod n. Now we just use PowerMod[a,1/2,n].

For example, to do the Mathematica calculation for the second point on the curve mod 751,
we define the following Mathematica functions f, g, h, k and perform the calculations.

f[x_]:=f[x]=Mod[PowerMod[x,3,751]-x+188,751]
g[x_]:=g[x]=PowerMod[x,1/2,751]
h[x_]:=h[x]=Mod[x-376,751]
k[x_]:=k[x]=Mod[-x-376,751]


f[21] = 416
g[416] = 618
h[618] = 242
k[618] = 508

Exercise 8.5.15 This exercise involves encrypting and decrypting messages using the curve 
just discussed. 


(a) Write the message
THEIR LANGUAGE IS THE LANGUAGE OF NUMBERS
AND THEY HAVE NO NEED TO SMILE
as a sequence of points on the curve


y^2 + y = x^3 − x (mod 751)


using the method described above. This is a quote from the Dr. Who episode Logopolis.
(b) Translate the following sequence of points on the curve in part (a) 


(421, 013)(361, 367)(621, 220)(283, 321)(421, 013)(484, 214)(461, 283) 
(324, 368)(201, 370) 

(461, 283)(261, 663)(501, 220)(543, 314)(484, 214)(562, 174)(501, 220) 
(283, 321)(543, 314) 


ElGamal Elliptic Curve Cryptosystem 


The preceding exercise produces an encryption that is extremely easy to decrypt. We need 
to do better if we want our messages to stay out of the hands of the shadow creatures. So 
we must do some more work on these messages. In short, we must think about the ElGamal 
cryptosystem. Two other methods are also given in Koblitz [55]. 

Suppose that John on Babylon 5 wants to send a message to Delenn on Minbar - without
the shadow creatures understanding the message. An elliptic curve E(F_p) is made public. A
generator g of the group G of this elliptic curve is chosen, if possible. If this is not possible,
John and Delenn want the subgroup ⟨g⟩ generated by g to be large (near the order of G).

Delenn picks a large integer a with 0 < a < |G|. Then a is her secret deciphering key.
She makes public the enciphering key ag.

John chooses an integer k. To send a message corresponding to a sequence of points P on
the elliptic curve E(F_p), John sends a sequence whose entries have the form (kg, P + kag),
for k in a list of random integers. He does not need to know a to compute k(ag), since he
knows both k and the public key ag.

Then for Delenn to decipher an entry of the message, she computes a(kg) = k(ag) and
subtracts it from P + kag.

If the shadow creatures could find a from ag, they could decipher the message too. This
is called the “discrete logarithm problem” for this elliptic curve situation. It is presumably
too hard for the shadows to solve - assuming we choose large enough elliptic curves.
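The exchange between John and Delenn can be tried out on a toy curve. The Python sketch below uses the curve y^2 = x^3 − x + 1 (mod 59) of Figure 8.21 with g = (1, 1); the secret key a = 7, the random integer k = 11, and all function names are illustrative choices of ours, and a curve this small offers no security at all. The point at infinity is None, and Python 3.8 or later is assumed for pow(n, -1, p).

```python
P_MOD, A_COEF, B_COEF = 59, 0, -1          # y^2 = x^3 - x + 1 (mod 59)


def ec_add(P, Q):
    """Group law on the curve; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        mu = (3*x1*x1 + 2*A_COEF*x1 + B_COEF) * pow(2*y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        mu = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (mu*mu - A_COEF - x1 - x2) % P_MOD
    return (x3, -(mu*(x3 - x1) + y1) % P_MOD)


def ec_neg(P):
    return None if P is None else (P[0], -P[1] % P_MOD)


def ec_mul(k, P):
    """kP by doubling and adding, reading the binary digits of k from the left."""
    result = None
    for bit in bin(k)[2:]:
        result = ec_add(result, result)
        if bit == "1":
            result = ec_add(result, P)
    return result


g = (1, 1)                                 # public base point
secret_a = 7                               # Delenn's secret key (illustrative)
public_key = ec_mul(secret_a, g)           # the public enciphering key ag


def encrypt(P, k):
    """John's side: send the pair (kg, P + k(ag))."""
    return ec_mul(k, g), ec_add(P, ec_mul(k, public_key))


def decrypt(pair):
    """Delenn's side: subtract a(kg) from the second entry."""
    kg, masked = pair
    return ec_add(masked, ec_neg(ec_mul(secret_a, kg)))


if __name__ == "__main__":
    message_point = (15, 36)               # the point A of Figure 8.21
    pair = encrypt(message_point, 11)
    print(pair, decrypt(pair))
```

Decryption works because a(kg) and k(ag) are the same group element, so subtracting one undoes adding the other.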

What is the advantage of elliptic curve cryptography over RSA described in Section 4.1? 
The public keys can be much smaller than those for RSA. 


Exercise 8.5.16 Suppose that you are Dumbledore. Use the preceding ElGamal cryptosystem 
and the elliptic curve in Exercise 8.5.15 to send the message WELL DONE SLYTHERIN to 
Professor Snape. Take g=(0,0). You, Dumbledore, choose the following random list of 
integers to use as the integers k for your message: 


290, 437, 129, 484, 312, 206, 435, 508, 151,335, 488, 501, 411, 429, 162, 535, 223. 
Professor Snape chooses a= 361. His public key is 361(0,0). What is the sequence of pairs 
of points (kg, P + kag) that Dumbledore sends? 

Exercise 8.5.17 Show that, for odd prime p, the Legendre symbol (n/p) from equation (8.34)
defines a group homomorphism f(n) = (n/p), mapping the multiplicative group F_p^* onto the
group {±1} under multiplication. Show that the kernel has order (p − 1)/2.


Exercise 8.5.18 Show that x^4 + 1 is always reducible in F_p[x], for all primes p.


Hint. For a prime p > 2, there are three cases: either −1, 2, or −2 is a square mod p. In each
case there will be a factorization as a product of two quadratic polynomials. For example,
in the second case, if 2 = a^2 (mod p), x^4 + 1 = (x^2 + ax + 1)(x^2 − ax + 1).


To end this section, which is really the end of this book, we include Figures 8.22 and
8.23 which are two pictures of level “curves” of y^2 − x^3 − x + 1 (mod 29). Modern art?


Figure 8.22 Level “curves” of y^2 − x^3 − x + 1 (mod 29)

Figure 8.23 Smoothed level “curves” of y^2 − x^3 − x + 1 (mod 29)


Figure 8.24 is a photoshopped version of the level curves of 
(y + 2x)* + (x — 2y)* (mod 101). 


The cover of this book involves another photoshopped version of Figure 8.22. 

Figure 8.24 A photoshopped version of the level curves of (y + 2x)* + (x − 2y)* (mod 101).
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distributive, 10, 159 

divides, 19 

division algebra, 223 

division algorithm, 21, 182 

divisor, 19, 181 


DNA, 134 

dodecahedron, 56, 122, 150, 269 
double coset, 98 

double helix, 134 

Dr. Who, 295 

dual group, 129 


echelon form, 206 

eigenvalue, 97, 133, 213, 233, 239, 240, 
268, 272, 273 

eigenvector, 272, 273 

Eisenstein series, 288 

elementary divisor, 211 

elementary matrix, 209 

elementary row operations, 206 


elementary symmetric polynomials, 117, 204 


ElGamal cryptosystem, 295 
elliptic curve, 282 

empty set, 6 

encryption, 124 

energy, 133, 137, 140, 141 
equivalence class, 32 
equivalence relation, 31 
error-correcting code, 258 
Euclid, xv 

Euclid’s lemma, 23 

Euclidean algorithm, 22, 180, 184 
Euclidean domain, 184 

Euler, xvii, 26, 135 

Euler phi-function, 60, 64 
Euler’s criterion, 202 

Euler’s identity, 104 
Euler-Lagrange equation, 136 
even permutation, 85 
expander graph, 269 
expansion by minors, 227, 229 
exponent, 65 

extension field, 171 

external direct product, 113 
extremal, 136 


factor group, 99 

factor ring, 174 

feedback shift register, 185, 186, 251 
Fermat’s last theorem, xvi 

Fermat’s little theorem, 95 
Fibonacci numbers, 17, 254 

field, 167 


field generated by an element of a larger field, 


230 
field of fractions or quotients, 203 
field of rational functions, 204 
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finite-dimensional vector space, 215 

finite field, xvi, xvii, 168, 175, 185, 221, 232, 
233, 236 

finite logarithm, 73 

finite subgroup test, 69 

finite upper half plane, 266 

finitely generated group, 211 

first principal of mathematical induction, 14 

fixed point set, 116 

floor, 20 

formal power series, 164 

Fourier, 106, 130 

Fourier transform, 130, 263, 265 

fractional linear transformation, 266 

free group, 56, 153 

free module, 217 

Frobenius, 114, 223, 238, 280 

Frobenius automorphism, 238 

function, 35 

functional, 136 

fundamental group of a graph, 152 

fundamental theorem of Abelian groups, 212 

fundamental theorem of algebra, 232 

fundamental theorem of arithmetic, 23, 180 

fundamental theorem of Galois theory, 240 

fundamental theorem on symmetric 
polynomials, 117, 204 


Godel incompleteness theorem, 3, 197 
Galois, 243 

Galois field, xvii, 222 

Galois group, xv, 92, 101, 204, 239 


Gauss, xiii, xvii, 5, 15, 24, 26, 106, 206, 222, 232 


Gaussian elimination, 206, 210 
Gaussian integers, 12, 161 


general linear group, 61, 70, 97, 104, 116, 150, 


210, 221, 229, 253, 266, 273 
generator matrix, 258 
generators of a group, 49, 52, 55, 69, 73, 79 
geodesic, 135, 266 
Golay code, 264 
golden ratio, 254 
golden rectangle, 254 
Google bombing, 280 
Google matrix, 278 
Gram-Schmidt process, 273 
graph, 55 
graphs of Lubotzky, Phillips, and Sarnak, 270 
greatest common divisor, 21, 23, 83, 184 
group, 43 
group action on a set, 114 


group algebra, 131, 223 
group of an elliptic curve, 285 
groups of order less than or equal to 15, 148 


Hadamard matrix, 262 

Hamilton, 112, 222 

Hamming, 257 

Hamming code, 260 

Hamming weight, 257 

Hasse diagram, 34 

Hasse’s theorem on the number of points on an 
elliptic curve, 292 

Heisenberg group, 151 

Hermitian matrix, 273 

Hilbert, 153, 174, 272 

Hilbert’s problems, 153 

homogeneous linear recurrent sequence, 251 

homomorphism, 102, 104, 189, 190 


icosahedron, 56, 63 
ideal, 173 
ideal generated by a set, 174 


identity, 10, 37, 43, 63, 159, 160, 162, 169, 221, 224, 227, 258, 285
identity matrix, 57, 61 
image, 35, 91, 104, 105, 190, 220, 258 
imaginary numbers, 5 
impulse response sequence, 251 
inclusion-exclusion principle, 41 
indeterminate, 86, 115, 163, 164, 180, 199 
induction hypothesis, 15 
inductive definition, 16 
infinite dimensional vector space, 215 
infinite set, 37 
injective, 36 
inner automorphism, 92 
inner product, 130 
integers, xv, 10-12 
integers modulo n, 28, 58, 98 
integral domain, 166 
internal direct product, 113 
intersection, 6 
invariance under a transformation, 138 
inverse, 10, 43, 159 
inverse function, 37 
inverse image, 39, 192 
inversion of Fourier transform, 132, 265 
irreducible polynomial, 181 
isomorphism, 89, 90, 104, 107, 170, 189, 190, 218, 222, 235, 238


Jacobi identity, 165 
Jordan form, 213 
kernel, 102, 103, 190, 220, 259
kinetic energy, 137 

Klein, 50 

Klein 4-group, 54, 60, 96, 108, 147 
Krawtchouk polynomial, 265 
Kronecker delta, 228 


Lagrange’s theorem, 95 

Lagrangian, 137 

Laplace expansion of determinant, 228 

Latin square, 54 

lattice in the plane, 288 

leading coefficient of a polynomial, 181 

least common multiple, 82, 109 

left group action, 114 

Legendre symbol, 291 

Lehmer, 244, 250 

Lie bracket, 165 

Lie group, 152 

line at infinity, 284 

linear combination, 23, 133, 215 

linear congruence, 29, 75, 77, 194 

linear congruential random number 
generator, 245 

linear function or mapping, 36, 62, 93, 103, 218 

linear map, 218 

linear recurrent sequence, 251 

linearly independent vectors, 215 

Lorentz group, 142 

Lubotzky, Phillips, and Sarnak graphs, 271 


majority rule, 265 

mapping or map, 35 

Markov chain, 274 

Markov chain Monte Carlo methods, 244 
Markov matrix, 274 
mathematical induction, 14, 16 
mathematics as a language, 5 
matrix exponential, 107 
matrix multiplication, 36 
matrix of a linear mapping, 219 
maximal ideal, 177, 183 
Maxwell’s equations, 138, 141 
Mersenne prime, 4 

methane, 57, 88 

metric, 258 

minimal polynomial, 231, 232 
minor, 227 

modular arithmetic, 26 
modular form, 266, 271, 288 
modular group, 51, 266 
module, 217 


momentum, 140 

monic polynomial, 181 

monomial, 165 

monster group, 101, 152 

Mordell’s theorem on elliptic curves, 287 

multilinear function, 224 

multiple, 19 

multiplication modulo n, 28, 59 

multiplicative group of finite field is cyclic, 79, 
202, 238 

multiplicity of a root of a polynomial, 183 


n factorial, 16 

natural numbers, 9, 12, 24 

natural projection, 105 

Newton's law, 137 

Noether, xvii, 104, 131, 135, 174 
Noether’s theorem, 139 
non-Euclidean geometry, 5, 266 
non-singular elliptic curve, 284 
non-singular square matrix, 62, 107, 207, 229 
norm, 129, 134, 222, 266 

normal subgroup, 95 

normalizer, 119 

nullity, 220 

nullspace, 220 

number of elements in a finite set, 37 


octahedron, 56, 122, 151, 267 

odd permutation, 85 
one-parameter group, 107, 141 
one-step subgroup test, 68 
one-to-one, 36 

onto, 36 

orbit, 83, 116 

orbit/stabilizer theorem, 117 

order, 11 

order of a finite group, 65 

order of an element in a group, 66, 77 
orientation of the coordinates, 227 
orthogonal group, 57 
orthogonality, 131 

outer automorphism, 92 


p-group, 119 

Pólya enumeration theory, 120
page rank, 279 

parallelepiped, 227 
parallelotope, 227 

parity check matrix, 259 
partial order, 34 

partition, 33 

Pascal’s triangle, 40 
period of a linear recurrent sequence, 252
permutation, 44, 81, 83
permutation matrix, 92, 209
permutation notation, 44
Perron theorem, 280
photo number 51, 134
physics, 50, 135, 152, 273
pigeonhole principle, 39
pivot, 206, 217
Platonic solids, 56
Poincaré, 266, 270
point at infinity, 284
polynomial in two indeterminates, 165
polynomial in n indeterminates, 86, 115
polynomial ring, 163, 175, 180
poset, 34
poset diagram, 34
potential energy, 137
power method to find dominant eigenvector, 279
power series, 18, 164
powers of group elements, 64
presentation, 55, 153
prime, 20
prime ideal, 176
primitive polynomial, 185, 186, 238
primitive root, 79
primitive root of unity, 233, 261, 264
principal ideal, 174, 210, 260
principal ideal domain, 210
principle of least action, 137
product of ideals, 179
projection, 105, 112
projective general linear group, 270
projective space, 284
projective special linear group, 152
proof by contradiction, 4
proper ideal, 173
proper subgroup, 67
proper symmetry, 56
pseudo random numbers, 244
public-key cryptography, 124, 125, 293, 295
Pythagoreans, 5, 24


quadratic formula, 101, 201
quadratic reciprocity law, 311, 312
quadratic residue code, 264
quantum physics, 50
quaternion, xvi, 112, 222
quaternion group, 112, 147
quintic equation, 101, 241
quotient group, 99
quotient ring, 174


Ramanujan, 272
Ramanujan graph, 269, 270
random number generator, 245
range, 220
rank of a free module, 217
rank of a linear map, 220
rank of an Abelian group, 287
rational canonical form, 213
rational functions, 203, 204, 230
rational numbers, xvi, 5, 12, 15, 158, 167, 202, 231, 241, 254, 286
real numbers, xiii, xvi, 5, 8, 12, 102, 105, 141, 158, 168, 226, 231, 247, 273
reciprocal polynomial, 241
reducible polynomial, 181
Reed-Muller code, 262
Reed-Solomon code, 262
reflexive, 31, 34
relation, 30
relation between group elements, 49, 55
relation between rank and nullity, 220
relatively prime, 21
restriction of a function, 38
resultant, 285
right group action, 114
ring, 159
ring generated by an element of a larger ring, 187, 230
ring of polynomials in several indeterminates, 204
roots of a polynomial, xv, 101, 164
rotation group, 141
row operations, 206
row rank of a matrix, 217, 258
row-reduced echelon form, 206
RSA cryptography, 124, 125
ruler and compass constructions, 241
Russell’s paradox, 6


scalar, 214
Schreier graph, 96
Schroedinger equation, 138
Schur decomposition, 273
second principle of mathematical induction, 16
semi-direct product, 148
shidoku or junior sudoku, 143
sign of a permutation, 86
similar matrices, 93, 213, 214, 221, 273
simple algebra, 223
simple field extension, 239
simple group, 100
Smith normal form of a matrix, 211
solvable group, 241
space group, 72, 152
span of set of vectors, 215
spanning tree, 153
special linear group, 152
special orthogonal group, 56, 57
special relativity, 142
spectral theorem, 133, 273
spectroscopy, 134, 273
spectrum, 133, 134, 142, 268, 272
splat, 129
splitting field of a polynomial, 234
sporadic group, 101
stabilizer, 116
state vector of linear recurrent sequence, 253
statistical tests, 255
subfield, 171
subfield test, 171
subgroup, 67
subring, 161
subring test, 161
subset, 6
subspace, 216
sudoku, 143
sum of ideals, 179
surjective, 36
switching functions, 120, 264
Sylow p-subgroup, 119
Sylow theorems, 119
symmetric, 31
symmetric group, 81
symmetric matrix, 55
symmetric polynomial, 117
symmetry, xiii, 47, 48, 50, 54, 152, 154
system of Euler-Lagrange equations, 137


tessellation, 270
tesseract, 9, 108
tetrahedral group, 87
tetrahedron, 56, 87, 153
torsion subgroup, 287
torus, 111, 195, 288
transcendental extension field, 230
transitive, 11, 31, 34
translation-invariant metric, 258
transpose of a matrix, 57, 97, 141, 143, 219, 226, 272, 281
transposition, 85
triangle inequality, 12, 258
trichotomy, 11
truncated icosahedron, 152
two-step subgroup test, 68


undirected graph, 55
union, 6
unique factorization into primes, 23
unit, 20, 59, 162, 180
unitary matrix, 273


Vandermonde determinant, 88
vector, 214
vector space, xiii, 61, 108, 130, 131, 185, 214
very useful polynomial, 86
vibrating system, 133, 142
viruses, 58
volume function, 227
voting, 244, 264


webpages, 276
websites, 276
Wedderburn theory, 224
Wedderburn’s theorem on finite division rings, 173
Wedderburn’s theorem on simple algebras, 223
Weierstrass function, 287
well defined, 28, 35
well-ordering axiom, 12
Wilson’s theorem, 64
word problem, 153


X-ray diffraction spectroscopy, 134


zero divisor, 10
Zorn’s lemma, 232

Abstract Algebra with Applications provides a friendly
and concise introduction to algebra, with an emphasis
on its uses in the modern world. The first part of
this book covers groups, after some preliminaries
on sets, functions, relations, and induction, and
features applications such as public-key cryptography,
Sudoku, the finite Fourier transform, and symmetry
in chemistry and physics. The second part of this book
covers rings and fields, and features applications such
as random number generators, error-correcting codes,
the Google page rank algorithm, communication
networks, and elliptic curve cryptography.


The book’s masterful use of colorful figures and 
images helps illustrate the applications and concepts 
in the text. Real-world examples and exercises will 
help students contextualize the information. Meant 
for a year-long undergraduate course in algebra for 
math, engineering, and computer science majors, the 
only prerequisites are calculus and a bit of courage 
when asked to do a short proof. 